Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Duplicate
Priority: Critical - P2
Fix Version/s: None
Affects Version/s: 3.2.11
Component/s: Index Maintenance, Performance, Querying, Stability
Labels:
None
Environment:

Hide
Amazon Linux:
cat /etc/os-release
NAME="Amazon Linux AMI"
VERSION="2016.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2016.03"
PRETTY_NAME="Amazon Linux AMI 2016.03"

EC2 instance type: i3.4xlarge, ephemeral storage

Mongo 3.2.11, community edition
3-node replica set
replication stayed intact during the event

db.stats();
{
"db" : "video",
"collections" : 15,
"objects" : 189601049,
"avgObjSize" : 322.20348721804805,
"dataSize" : 61090119168,
"storageSize" : 26507411456,
"numExtents" : 0,
"indexes" : 62,
"indexSize" : 22200733696,
"ok" : 1
}

db.coll.stats()
{
"ns" : < redacted />,
"count" : 189590,
"size" : 78869440,
"avgObjSize" : 416,
"storageSize" : 270233600,
"capped" : false,
"wiredTiger" : {
"metadata" : {
"formatVersion" : 1
},
"creationString" : "allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
"type" : "file",
"uri" : "statistics:table:collection-25-6542812937956687151",
"LSM" : {
"bloom filter false positives" : 0,
"bloom filter hits" : 0,
"bloom filter misses" : 0,
"bloom filter pages evicted from cache" : 0,
"bloom filter pages read into cache" : 0,
"bloom filters in the LSM tree" : 0,
"chunks in the LSM tree" : 0,
"highest merge generation in the LSM tree" : 0,
"queries that could have benefited from a Bloom filter that did not exist" : 0,
"sleep for LSM checkpoint throttle" : 0,
"sleep for LSM merge throttle" : 0,
"total size of bloom filters" : 0
},
"block-manager" : {
"allocations requiring file extension" : 108716,
"blocks allocated" : 3378588,
"blocks freed" : 3300241,
"checkpoint size" : 24412160,
"file allocation unit size" : 4096,
"file bytes available for reuse" : 245805056,
"file magic number" : 120897,
"file major version number" : 1,
"file size in bytes" : 270233600,
"minor version number" : 0
},
"btree" : {
"btree checkpoint generation" : 25214,
"column-store fixed-size leaf pages" : 0,
"column-store internal pages" : 0,
"column-store variable-size RLE encoded values" : 0,
"column-store variable-size deleted values" : 0,
"column-store variable-size leaf pages" : 0,
"fixed-record size" : 0,
"maximum internal page key size" : 368,
"maximum internal page size" : 4096,
"maximum leaf page key size" : 2867,
"maximum leaf page size" : 32768,
"maximum leaf page value size" : 67108864,
"maximum tree depth" : 4,
"number of key/value pairs" : 0,
"overflow pages" : 0,
"pages rewritten by compaction" : 0,
"row-store internal pages" : 0,
"row-store leaf pages" : 0
},
"cache" : {
"bytes currently in the cache" : 71643643,
"bytes read into cache" : 60930147340,
"bytes written from cache" : 86565034308,
"checkpoint blocked page eviction" : 6,
"data source pages selected for eviction unable to be evicted" : 25790,
"hazard pointer blocked page eviction" : 1412,
"in-memory page passed criteria to be split" : 415,
"in-memory page splits" : 216,
"internal pages evicted" : 61443,
"internal pages split during eviction" : 3,
"leaf pages split during eviction" : 542,
"modified pages evicted" : 814217,
"overflow pages read into cache" : 0,
"overflow values cached in memory" : 0,
"page split during eviction deepened the tree" : 1,
"page written requiring lookaside records" : 0,
"pages read into cache" : 2319194,
"pages read into cache requiring lookaside entries" : 0,
"pages requested from the cache" : 32208865760,
"pages written from cache" : 3328351,
"pages written requiring in-memory restoration" : 1856,
"unmodified pages evicted" : 1561421
},
"cache_walk" : {
"Average difference between current eviction generation when the page was last considered" : 0,
"Average on-disk page image size seen" : 0,
"Clean pages currently in cache" : 0,
"Current eviction generation" : 0,
"Dirty pages currently in cache" : 0,
"Entries in the root page" : 0,
"Internal pages currently in cache" : 0,
"Leaf pages currently in cache" : 0,
"Maximum difference between current eviction generation when the page was last considered" : 0,
"Maximum page size seen" : 0,
"Minimum on-disk page image size seen" : 0,
"On-disk page image sizes smaller than a single allocation unit" : 0,
"Pages created in memory and never written" : 0,
"Pages currently queued for eviction" : 0,
"Pages that could not be queued for eviction" : 0,
"Refs skipped during cache traversal" : 0,
"Size of the root page" : 0,
"Total number of pages currently in cache" : 0
},
"compression" : {
"compressed pages read" : 2253153,
"compressed pages written" : 3071496,
"page written failed to compress" : 0,
"page written was too small to compress" : 256849,
"raw compression call failed, additional data available" : 0,
"raw compression call failed, no additional data available" : 0,
"raw compression call succeeded" : 0
},
"cursor" : {
"bulk-loaded cursor-insert calls" : 0,
"create calls" : 57487,
"cursor-insert key and value bytes inserted" : 47544376271,
"cursor-remove key bytes removed" : 9788001,
"cursor-update value bytes updated" : 0,
"insert calls" : 113143604,
"next calls" : 92601311,
"prev calls" : 1,
"remove calls" : 2466220,
"reset calls" : 1934752542,
"restarted searches" : 299,
"search calls" : 155871005685,
"search near calls" : 4727262,
"truncate calls" : 0,
"update calls" : 0
},
"reconciliation" : {
"dictionary matches" : 0,
"fast-path pages deleted" : 0,
"internal page key bytes discarded using suffix compression" : 2157000,
"internal page multi-block writes" : 28994,
"internal-page overflow keys" : 0,
"leaf page key bytes discarded using prefix compression" : 0,
"leaf page multi-block writes" : 34541,
"leaf-page overflow keys" : 0,
"maximum blocks required for a page" : 128,
"overflow values written" : 0,
"page checksum matches" : 875369,
"page reconciliation calls" : 1881019,
"page reconciliation calls for eviction" : 57237,
"pages deleted" : 36443
},
"session" : {
"object compaction" : 0,
"open cursor count" : 600
},
"transaction" : {
"update conflicts" : 189
}
},
"nindexes" : 8,
"totalIndexSize" : 17039360,
"indexSizes" : {
"_id_" : 6955008,
"_p_userDetails_1__p_socialNetwork_1__created_at_-1" : 7241728,
"broadcasterAge_1" : 36864,
"_created_at_1" : 2543616,
"activeUntil_1_currentViewers_-1__created_at_-1" : 69632,
"activeUntil_1_currentViewers_-1__created_at_1" : 69632,
"activeUntil_1_currentViewers_-1" : 53248,
"activeUntil_1__created_at_1_currentViewers_-1" : 69632
},
"ok" : 1
}

Show
Amazon Linux: cat /etc/os-release NAME="Amazon Linux AMI" VERSION="2016.03" ID="amzn" ID_LIKE="rhel fedora" VERSION_ID="2016.03" PRETTY_NAME="Amazon Linux AMI 2016.03" EC2 instance type: i3.4xlarge, ephemeral storage Mongo 3.2.11, community edition 3-node replica set replication stayed intact during the event db.stats(); { "db" : "video", "collections" : 15, "objects" : 189601049, "avgObjSize" : 322.20348721804805, "dataSize" : 61090119168, "storageSize" : 26507411456, "numExtents" : 0, "indexes" : 62, "indexSize" : 22200733696, "ok" : 1 } db.coll.stats() { "ns" : < redacted />, "count" : 189590, "size" : 78869440, "avgObjSize" : 416, "storageSize" : 270233600, "capped" : false, "wiredTiger" : { "metadata" : { "formatVersion" : 1 }, "creationString" : "allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u", "type" : "file", "uri" : "statistics:table:collection-25-6542812937956687151", "LSM" : { "bloom filter false positives" : 0, "bloom filter hits" : 0, "bloom filter misses" : 0, "bloom filter pages evicted from cache" : 0, "bloom filter pages read into cache" : 0, "bloom filters in the LSM tree" : 0, "chunks in the LSM tree" : 0, "highest merge generation in the LSM tree" : 0, "queries that could have benefited from a Bloom filter that did not exist" : 0, "sleep for LSM checkpoint throttle" : 0, "sleep for LSM merge throttle" : 0, "total size of bloom filters" : 0 }, "block-manager" : { "allocations requiring file extension" : 108716, "blocks allocated" : 3378588, "blocks freed" : 3300241, "checkpoint size" : 24412160, "file allocation unit size" : 4096, "file bytes available for reuse" : 245805056, "file magic number" : 120897, "file major version number" : 1, "file size in bytes" : 270233600, "minor version number" : 0 }, "btree" : { "btree checkpoint generation" : 25214, "column-store fixed-size leaf pages" : 0, "column-store internal pages" : 0, "column-store variable-size RLE encoded values" : 0, "column-store variable-size deleted values" : 0, "column-store variable-size leaf pages" : 0, "fixed-record size" : 0, "maximum internal page key size" : 368, "maximum internal page size" : 4096, "maximum leaf page key size" : 2867, "maximum leaf page size" : 32768, "maximum leaf page value size" : 67108864, "maximum tree depth" : 4, "number of key/value pairs" : 0, "overflow pages" : 0, "pages rewritten by compaction" : 0, "row-store internal pages" : 0, "row-store leaf pages" : 0 }, "cache" : { "bytes currently in the cache" : 71643643, "bytes read into cache" : 60930147340, "bytes written from cache" : 86565034308, "checkpoint blocked page eviction" : 6, "data source pages selected for eviction unable to be evicted" : 25790, "hazard pointer blocked page eviction" : 1412, "in-memory page passed criteria to be split" : 415, "in-memory page splits" : 216, "internal pages evicted" : 61443, "internal pages split during eviction" : 3, "leaf pages split during eviction" : 542, "modified pages evicted" : 814217, "overflow pages read into cache" : 0, "overflow values cached in memory" : 0, "page split during eviction deepened the tree" : 1, "page written requiring lookaside records" : 0, "pages read into cache" : 2319194, "pages read into cache requiring lookaside entries" : 0, "pages requested from the cache" : 32208865760, "pages written from cache" : 3328351, "pages written requiring in-memory restoration" : 1856, "unmodified pages evicted" : 1561421 }, "cache_walk" : { "Average difference between current eviction generation when the page was last considered" : 0, "Average on-disk page image size seen" : 0, "Clean pages currently in cache" : 0, "Current eviction generation" : 0, "Dirty pages currently in cache" : 0, "Entries in the root page" : 0, "Internal pages currently in cache" : 0, "Leaf pages currently in cache" : 0, "Maximum difference between current eviction generation when the page was last considered" : 0, "Maximum page size seen" : 0, "Minimum on-disk page image size seen" : 0, "On-disk page image sizes smaller than a single allocation unit" : 0, "Pages created in memory and never written" : 0, "Pages currently queued for eviction" : 0, "Pages that could not be queued for eviction" : 0, "Refs skipped during cache traversal" : 0, "Size of the root page" : 0, "Total number of pages currently in cache" : 0 }, "compression" : { "compressed pages read" : 2253153, "compressed pages written" : 3071496, "page written failed to compress" : 0, "page written was too small to compress" : 256849, "raw compression call failed, additional data available" : 0, "raw compression call failed, no additional data available" : 0, "raw compression call succeeded" : 0 }, "cursor" : { "bulk-loaded cursor-insert calls" : 0, "create calls" : 57487, "cursor-insert key and value bytes inserted" : 47544376271, "cursor-remove key bytes removed" : 9788001, "cursor-update value bytes updated" : 0, "insert calls" : 113143604, "next calls" : 92601311, "prev calls" : 1, "remove calls" : 2466220, "reset calls" : 1934752542, "restarted searches" : 299, "search calls" : 155871005685, "search near calls" : 4727262, "truncate calls" : 0, "update calls" : 0 }, "reconciliation" : { "dictionary matches" : 0, "fast-path pages deleted" : 0, "internal page key bytes discarded using suffix compression" : 2157000, "internal page multi-block writes" : 28994, "internal-page overflow keys" : 0, "leaf page key bytes discarded using prefix compression" : 0, "leaf page multi-block writes" : 34541, "leaf-page overflow keys" : 0, "maximum blocks required for a page" : 128, "overflow values written" : 0, "page checksum matches" : 875369, "page reconciliation calls" : 1881019, "page reconciliation calls for eviction" : 57237, "pages deleted" : 36443 }, "session" : { "object compaction" : 0, "open cursor count" : 600 }, "transaction" : { "update conflicts" : 189 } }, "nindexes" : 8, "totalIndexSize" : 17039360, "indexSizes" : { "_id_" : 6955008, "_p_userDetails_1__p_socialNetwork_1__created_at_-1" : 7241728, "broadcasterAge_1" : 36864, "_created_at_1" : 2543616, "activeUntil_1_currentViewers_-1__created_at_-1" : 69632, "activeUntil_1_currentViewers_-1__created_at_1" : 69632, "activeUntil_1_currentViewers_-1" : 53248, "activeUntil_1__created_at_1_currentViewers_-1" : 69632 }, "ok" : 1 }

Operating System:
ALL
Steps To Reproduce:
Hide

1. Query a collection's index stats:

db.coll.aggregate( { $indexStats: {} }, { $project: { _id: 0, name: 1, "accesses.ops": 1, "accesses.since": 1 } } ).pretty()

2. Note which indexes are not in use, indicated by ops of 0 (zero).
3. Drop the unused indexes.
4. Note an increase in query response time.
5. Add the indexes back.
6. Note query response time returns to nominal levels.
7. Query a collection's index stats again, note the re-added indexes still should zero ops, while all other indexes' ops counters have increased.
Show
1. Query a collection's index stats: db.coll.aggregate( { $indexStats: {} }, { $project: { _id: 0, name: 1, "accesses.ops" : 1, "accesses.since" : 1 } } ).pretty() 2. Note which indexes are not in use, indicated by ops of 0 (zero). 3. Drop the unused indexes. 4. Note an increase in query response time. 5. Add the indexes back. 6. Note query response time returns to nominal levels. 7. Query a collection's index stats again, note the re-added indexes still should zero ops, while all other indexes' ops counters have increased.
Confidence Status:
None
Work Order:
0
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The aggregate framework's $indexStats() operation reported several indexes with zero ops. The indexes were dropped. Immediately afterwards New Relic reported skyrocketing response times, going from a 99 percentile response time of 208 ms to 59,700 ms. All read tickets were in use.

Adding the indexes back reverted the response time to nominal levels (~200 ms). Requerying the $indexStats() of the collection still showed the indexes were not used, i.e. ops was at zero. Other indexes on the collection showed advancing ops counters.

Note this happened to a production system under load and took 17 minutes to recover.

duplicates

WT-3207 Drops with checkpoint_wait=false should not wait for checkpoints

Closed

Assignee:: Mark Agarunov (Inactive)
Reporter:: Bob Lunney
Participants:: Bob Lunney, Mark Agarunov
Votes:: 0 Vote for this issue
Watchers:: 12 Start watching this issue

Created:: Jun 01 2017 08:34:30 PM UTC
Updated:: Jul 12 2017 09:23:25 PM UTC
Resolved:: Jun 02 2017 09:10:28 PM UTC

Details

Description

Attachments

Issue Links

Forms

Activity

People

Dates