[SERVER-29419] Dropping unused indexes adversely affects query performance Created: 01/Jun/17  Updated: 12/Jul/17  Resolved: 02/Jun/17

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Performance, Querying, Stability
Affects Version/s: 3.2.11
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Bob Lunney Assignee: Mark Agarunov
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Amazon Linux:
cat /etc/os-release
NAME="Amazon Linux AMI"
VERSION="2016.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2016.03"
PRETTY_NAME="Amazon Linux AMI 2016.03"

EC2 instance type: i3.4xlarge, ephemeral storage

Mongo 3.2.11, community edition
3-node replica set
replication stayed intact during the event

db.stats();

{ "db" : "video", "collections" : 15, "objects" : 189601049, "avgObjSize" : 322.20348721804805, "dataSize" : 61090119168, "storageSize" : 26507411456, "numExtents" : 0, "indexes" : 62, "indexSize" : 22200733696, "ok" : 1 }

db.coll.stats()
{
"ns" : < redacted />,
"count" : 189590,
"size" : 78869440,
"avgObjSize" : 416,
"storageSize" : 270233600,
"capped" : false,
"wiredTiger" : {
"metadata" :

{ "formatVersion" : 1 }

,
"creationString" : "allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
"type" : "file",
"uri" : "statistics:table:collection-25-6542812937956687151",
"LSM" :

{ "bloom filter false positives" : 0, "bloom filter hits" : 0, "bloom filter misses" : 0, "bloom filter pages evicted from cache" : 0, "bloom filter pages read into cache" : 0, "bloom filters in the LSM tree" : 0, "chunks in the LSM tree" : 0, "highest merge generation in the LSM tree" : 0, "queries that could have benefited from a Bloom filter that did not exist" : 0, "sleep for LSM checkpoint throttle" : 0, "sleep for LSM merge throttle" : 0, "total size of bloom filters" : 0 }

,
"block-manager" :

{ "allocations requiring file extension" : 108716, "blocks allocated" : 3378588, "blocks freed" : 3300241, "checkpoint size" : 24412160, "file allocation unit size" : 4096, "file bytes available for reuse" : 245805056, "file magic number" : 120897, "file major version number" : 1, "file size in bytes" : 270233600, "minor version number" : 0 }

,
"btree" :

{ "btree checkpoint generation" : 25214, "column-store fixed-size leaf pages" : 0, "column-store internal pages" : 0, "column-store variable-size RLE encoded values" : 0, "column-store variable-size deleted values" : 0, "column-store variable-size leaf pages" : 0, "fixed-record size" : 0, "maximum internal page key size" : 368, "maximum internal page size" : 4096, "maximum leaf page key size" : 2867, "maximum leaf page size" : 32768, "maximum leaf page value size" : 67108864, "maximum tree depth" : 4, "number of key/value pairs" : 0, "overflow pages" : 0, "pages rewritten by compaction" : 0, "row-store internal pages" : 0, "row-store leaf pages" : 0 }

,
"cache" :

{ "bytes currently in the cache" : 71643643, "bytes read into cache" : 60930147340, "bytes written from cache" : 86565034308, "checkpoint blocked page eviction" : 6, "data source pages selected for eviction unable to be evicted" : 25790, "hazard pointer blocked page eviction" : 1412, "in-memory page passed criteria to be split" : 415, "in-memory page splits" : 216, "internal pages evicted" : 61443, "internal pages split during eviction" : 3, "leaf pages split during eviction" : 542, "modified pages evicted" : 814217, "overflow pages read into cache" : 0, "overflow values cached in memory" : 0, "page split during eviction deepened the tree" : 1, "page written requiring lookaside records" : 0, "pages read into cache" : 2319194, "pages read into cache requiring lookaside entries" : 0, "pages requested from the cache" : 32208865760, "pages written from cache" : 3328351, "pages written requiring in-memory restoration" : 1856, "unmodified pages evicted" : 1561421 }

,
"cache_walk" :

{ "Average difference between current eviction generation when the page was last considered" : 0, "Average on-disk page image size seen" : 0, "Clean pages currently in cache" : 0, "Current eviction generation" : 0, "Dirty pages currently in cache" : 0, "Entries in the root page" : 0, "Internal pages currently in cache" : 0, "Leaf pages currently in cache" : 0, "Maximum difference between current eviction generation when the page was last considered" : 0, "Maximum page size seen" : 0, "Minimum on-disk page image size seen" : 0, "On-disk page image sizes smaller than a single allocation unit" : 0, "Pages created in memory and never written" : 0, "Pages currently queued for eviction" : 0, "Pages that could not be queued for eviction" : 0, "Refs skipped during cache traversal" : 0, "Size of the root page" : 0, "Total number of pages currently in cache" : 0 }

,
"compression" :

{ "compressed pages read" : 2253153, "compressed pages written" : 3071496, "page written failed to compress" : 0, "page written was too small to compress" : 256849, "raw compression call failed, additional data available" : 0, "raw compression call failed, no additional data available" : 0, "raw compression call succeeded" : 0 }

,
"cursor" :

{ "bulk-loaded cursor-insert calls" : 0, "create calls" : 57487, "cursor-insert key and value bytes inserted" : 47544376271, "cursor-remove key bytes removed" : 9788001, "cursor-update value bytes updated" : 0, "insert calls" : 113143604, "next calls" : 92601311, "prev calls" : 1, "remove calls" : 2466220, "reset calls" : 1934752542, "restarted searches" : 299, "search calls" : 155871005685, "search near calls" : 4727262, "truncate calls" : 0, "update calls" : 0 }

,
"reconciliation" :

{ "dictionary matches" : 0, "fast-path pages deleted" : 0, "internal page key bytes discarded using suffix compression" : 2157000, "internal page multi-block writes" : 28994, "internal-page overflow keys" : 0, "leaf page key bytes discarded using prefix compression" : 0, "leaf page multi-block writes" : 34541, "leaf-page overflow keys" : 0, "maximum blocks required for a page" : 128, "overflow values written" : 0, "page checksum matches" : 875369, "page reconciliation calls" : 1881019, "page reconciliation calls for eviction" : 57237, "pages deleted" : 36443 }

,
"session" :

{ "object compaction" : 0, "open cursor count" : 600 }

,
"transaction" :

{ "update conflicts" : 189 }

},
"nindexes" : 8,
"totalIndexSize" : 17039360,
"indexSizes" :

{ "_id_" : 6955008, "_p_userDetails_1__p_socialNetwork_1__created_at_-1" : 7241728, "broadcasterAge_1" : 36864, "_created_at_1" : 2543616, "activeUntil_1_currentViewers_-1__created_at_-1" : 69632, "activeUntil_1_currentViewers_-1__created_at_1" : 69632, "activeUntil_1_currentViewers_-1" : 53248, "activeUntil_1__created_at_1_currentViewers_-1" : 69632 }

,
"ok" : 1
}


Issue Links:
Duplicate
duplicates WT-3207 Drops with checkpoint_wait=false shou... Closed
Operating System: ALL
Steps To Reproduce:

1. Query a collection's index stats:

db.coll.aggregate( { $indexStats: {} }, { $project: { _id: 0, name: 1, "accesses.ops": 1, "accesses.since": 1 } } ).pretty()

2. Note which indexes are not in use, indicated by ops of 0 (zero).
3. Drop the unused indexes.
4. Note an increase in query response time.
5. Add the indexes back.
6. Note query response time returns to nominal levels.
7. Query a collection's index stats again, note the re-added indexes still should zero ops, while all other indexes' ops counters have increased.

Participants:

 Description   

The aggregate framework's $indexStats() operation reported several indexes with zero ops. The indexes were dropped. Immediately afterwards New Relic reported skyrocketing response times, going from a 99 percentile response time of 208 ms to 59,700 ms. All read tickets were in use.

Adding the indexes back reverted the response time to nominal levels (~200 ms). Requerying the $indexStats() of the collection still showed the indexes were not used, i.e. ops was at zero. Other indexes on the collection showed advancing ops counters.

Note this happened to a production system under load and took 17 minutes to recover.



 Comments   
Comment by Bob Lunney [ 02/Jun/17 ]

Mark,

Thanks for looking into this issue. We will upgrade from 3.2.11 to the latest 3.2 version. If the issue persists I will open a new ticket and reference this one.

Regards,

Bob

Comment by Mark Agarunov [ 02/Jun/17 ]

Hello blunney,

Thank you for providing these files. Looking over the diagnostic data and the logs, it appears that this behavior is due to the same issue as described in WT-3207. As this issue has been fixed in more recent versions (3.2.13/3.4.4), My recommendation would be to upgrade to the most recent version of MongoDB where this issue is no longer present. If upgrading does not alleviate this issue, please let us know and we will further investigate.

Thanks,
Mark

Comment by Bob Lunney [ 02/Jun/17 ]

Mark,

Requested files have been uploaded. Please let me know if there is anything else I can provide to help diagnosis the issue.

Bob

Comment by Mark Agarunov [ 01/Jun/17 ]

Hello blunney,

Thank you for the report. To get a better understanding of what might be causing this issue, could you please provide the following:

  • The complete log files from mongod when this behavior was present,
  • Please archive the $dbPath/diagnostic.data directory from the affected node.

I've created a secure upload portal so that you can send us the files.

Thanks,
Mark

Generated at Thu Feb 08 04:20:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.