[SERVER-21027] Reduced performance of index lookups after removing documents from collection Created: 20/Oct/15  Updated: 07/Dec/15  Resolved: 12/Nov/15

Status: Closed
Project: Core Server
Component/s: Performance, WiredTiger
Affects Version/s: 3.0.7
Fix Version/s: 3.0.8, 3.2.0-rc3

Type: Bug
Priority: Major - P3
Reporter: Bruce Lucas (Inactive)
Assignee: Michael Cahill (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-21442 WiredTiger changes for MongoDB 3.0.8 Closed
is depended on by WT-1973 MongoDB changes for WiredTiger 2.7.0 Closed
Related
related to SERVER-20876 Hang in scenario with sharded ttl col... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL

 Description   

This ticket is a spinoff from SERVER-20876 to investigate the underlying issue: index lookups become slower after documents are removed from a collection. See the referenced comment for more details and a repro; in summary, the test

  1. creates an empty collection and index and does queries using that index
  2. inserts some number of documents into the collection
  3. removes all documents from the collection, leaving collection and index empty again
  4. repeats the same queries using the now-empty index; these are observed to be significantly slower than the queries in step 1 on the new index and collection

Referring to the stack trace in the referenced comment, the underlying cause seems to be slow WT search_near operations requiring tree walks, even though the table should be empty.
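
As a rough illustration of why a lookup on an "empty" table can still be slow, here is a minimal toy model. It is not WiredTiger code: `ToyTree`, its page layout, and its `search_near` method are hypothetical names invented for this sketch, and the model assumes that removing keys empties leaf pages without taking them out of the tree.

```python
class ToyTree:
    """Toy model (not WiredTiger): leaf pages kept in key order."""

    def __init__(self, keys_per_page=4):
        self.keys_per_page = keys_per_page
        self.pages = [[]]  # a brand-new tree has a single empty leaf page

    def insert(self, key):
        # Keys are assumed to arrive in ascending order to keep the model simple.
        if len(self.pages[-1]) >= self.keys_per_page:
            self.pages.append([])
        self.pages[-1].append(key)

    def remove_all(self):
        # Empty every page but leave the (now empty) pages in the tree.
        for page in self.pages:
            page.clear()

    def search_near(self, key):
        """Return (first key >= key, or None; number of leaf pages visited)."""
        visited = 0
        for page in self.pages:
            visited += 1
            for k in page:
                if k >= key:
                    return k, visited
        return None, visited


fresh = ToyTree()
fresh_result = fresh.search_near(5)   # step 1: query a new, empty index

used = ToyTree()
for i in range(1000):                 # step 2: insert documents
    used.insert(i)
used.remove_all()                     # step 3: remove them all
used_result = used.search_near(5)     # step 4: same query, empty again
```

In this model the query on the fresh tree visits one page, while the same query after the insert and remove cycle visits every leftover empty page before concluding that nothing is there.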

The impact on user-level operations depends on how much overhead the query path incurs before reaching the WT layer. In the particular repro referenced above, declines of 15-50% were seen, depending on the number of documents inserted and then removed. That path has a fair amount of overhead outside WT, so the impact could be substantially higher for operations that do repeated lookups with little code path overhead outside WT.
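
The dependence on overhead outside WT can be sketched with a simple proportional model. The function name and the input values below are assumptions chosen for illustration, not measurements from this ticket.

```python
def throughput_decline(wt_fraction, wt_slowdown):
    """Fractional drop in operations per second.

    wt_fraction: share of per-operation time spent in the storage engine
                 before the slowdown (0.0 to 1.0), a hypothetical input.
    wt_slowdown: factor by which that share becomes slower (2.0 = twice as slow).
    """
    # Time per operation after the slowdown, relative to 1.0 before it.
    new_time = (1.0 - wt_fraction) + wt_fraction * wt_slowdown
    return 1.0 - 1.0 / new_time


# Same storage-engine slowdown, different amounts of overhead outside it.
heavy_overhead = throughput_decline(0.2, 2.0)
light_overhead = throughput_decline(0.9, 2.0)
```

Under these assumed inputs, a path that spends 20% of its time in the storage engine loses about 17% of its throughput when that part becomes twice as slow, while a path that spends 90% there loses about 47%.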



 Comments   
Comment by Githook User [ 30/Nov/15 ]

Author:

{u'name': u'Ramon Fernandez', u'email': u'ramon@mongodb.com'}

Message: Import wiredtiger-wiredtiger-mongodb-3.0.7-9-gdeb2d81.tar.gz from wiredtiger branch mongodb-3.0

ref: cb64236..deb2d81

deb2d81 SERVER-21027 Reverse split if there are many deleted pages (3.0)
66a111e WT-2195 Fix a hang after giving up on a reverse split.
7b1398a SERVER-21027 Don't leave empty internal pages in the tree
c819d2f SERVER-21027 Fix reverse splits to keep the original child ref locked
00dfebc SERVER-21027 Reverse split if there are many deleted pages
Branch: v3.0
https://github.com/mongodb/mongo/commit/9add8acc69a119949a156b815003ecc15db75e0d

Comment by Githook User [ 27/Nov/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: Merge pull request #2330 from wiredtiger/reverse-splits-3.0_2

SERVER-21027 Reverse split if there are many deleted pages (3.0)
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/deb2d8109ca59cc9e223fd4f5be19915b949c628

Comment by Githook User [ 27/Nov/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: Merge pull request #2278 from wiredtiger/SERVER-21027-fix

SERVER-21027 Don't leave empty internal pages in the tree

(cherry picked from commit ba931c1)
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/7b1398a1a6ee6bd4e0624c38c5311e896a42cbfc

Comment by Githook User [ 27/Nov/15 ]

Author:

{u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexander.gorrod@mongodb.com'}

Message: Merge pull request #2271 from wiredtiger/reverse-split-fix

SERVER-21027 Fix reverse splits to keep the original child ref locked

(cherry picked from commit f4d20a3)
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/c819d2f9d34d8d701e986da4ea628c08239f8626

Comment by Githook User [ 27/Nov/15 ]

Author:

{u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexander.gorrod@mongodb.com'}

Message: Merge pull request #2260 from wiredtiger/reverse-splits

SERVER-21027 Reverse split if there are many deleted pages

(cherry picked from commit 35d46c3)

Conflicts:
src/btree/bt_delete.c
src/btree/bt_read.c
src/evict/evict_page.c
Branch: mongodb-3.0
https://github.com/wiredtiger/wiredtiger/commit/00dfebc9b099a80c0ce8bbe69ef97168eda23bfd

Comment by Githook User [ 30/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: Merge pull request #2278 from wiredtiger/SERVER-21027-fix

SERVER-21027 Don't leave empty internal pages in the tree
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/ba931c1c64869b74c8699a6eb56c88b83521e9f4

Comment by Githook User [ 30/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Remove code that updated snapshots for eviction.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/0ca615c0dc3c7ce331cc673edb21648faaa4d5cb

Comment by Githook User [ 30/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Don't leave empty internal pages in the tree.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/6c4abfc641fc5dded5f938bd0208e75cb9877f74

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexander.gorrod@mongodb.com'}

Message: Merge pull request #2271 from wiredtiger/reverse-split-fix

SERVER-21027 Fix reverse splits to keep the original child ref locked
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/f4d20a30f43f23c68c3198efd1b67ea6e5490fb2

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Fix reverse splits to keep the original child ref locked until we have a hazard pointer on the parent page, as we do for regular splits.

Otherwise, a reverse split could race with eviction of the internal page.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/5fdd6fbb9bca9a2d915f1580e313f13c8c1def22

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'agorrod', u'name': u'Alex Gorrod', u'email': u'alexander.gorrod@mongodb.com'}

Message: Merge pull request #2260 from wiredtiger/reverse-splits

SERVER-21027 Reverse split if there are many deleted pages
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/35d46c3a0e7b4c87849df7283544d2e6208d5edd

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Implement @keithbostic's review comments.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/b564700ba2e36068dd81c0e1e75d9cc692cb7d4d

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Reverse split whenever more than 10% of the refs in the parent are deleted.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/d305119969ab2ca42c45a37f2a294c25cc4cdb67
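
The 10% rule named in the commit message above can be sketched as follows. This is a simplified illustration, not the WiredTiger implementation: `Ref`, `should_reverse_split`, and `reverse_split` are hypothetical names, and the real code also has to handle locking, visibility, and eviction, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Ref:
    """Hypothetical stand-in for a parent page's reference to a child page."""
    key: int
    deleted: bool = False


def should_reverse_split(refs):
    # "More than 10%" checked with integer arithmetic to avoid float rounding.
    deleted = sum(1 for r in refs if r.deleted)
    return deleted * 10 > len(refs)


def reverse_split(refs):
    """Return the parent's refs with deleted children dropped, if warranted."""
    if not should_reverse_split(refs):
        return refs
    return [r for r in refs if not r.deleted]


# A parent with 100 children, 11 of them deleted: just over the threshold.
parent = [Ref(key=i, deleted=(i < 11)) for i in range(100)]
compacted = reverse_split(parent)
```

Using an integer comparison keeps the boundary case exact: 10 deleted refs out of 100 does not trigger a reverse split in this sketch, while 11 does.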

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Fix a bug in visibility checks for reverse splits, where truncated pages could be discarded too early.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/c7ab7fc06676fe6ec04eb5503ebe63c4bc301ce2

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Don't try to do reverse splits during tree walks: rely on eviction to notice when there are many refs to delete.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/889e0bb988d1fd515f7b2fd51838a1870be68273

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Don't instantiate empty pages during tree walks.

Instantiating empty pages is only required during searches, and then only so
that subsequent operations have a valid position. Doing it more often is
counter-productive because it interferes with reverse splits.

(cherry picked from commit fe015e3d2e5564603d50c34ea1145fb0e5dca52e)
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/c453ee986847fabe5a9f86490a156286a054e5a4

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Do reverse splits when deleting pages.

Specifically, if we notice a parent page has a multiple of 10 deleted children, try to do a reverse split to discard those child pages.

(cherry picked from commit dc84744680d0071f5d6f39a755e2dc6d6ece7b1a)
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/0ba7fcd74874ab5e37cb4ea64b0d8781e8e41e67

Comment by Githook User [ 29/Oct/15 ]

Author:

{u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

Message: SERVER-21027 Reverse-split internal pages that contain many deleted children.

(cherry picked from commit 2b89b915cafb86e2aa22b72948f5e411597831c0)
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/68138e21d0fe1529d7f65cebd2556699e3fa854b
