[SERVER-65527] Use range bounded cursors in place of prefix search_near on unique indexes Created: 13/Apr/22 Updated: 16/Jan/24 Resolved: 26/Oct/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.2.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Deepti Hasija | Assignee: | Gregory Wlodarek |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Sprint: | Execution Team 2022-08-08, Execution Team 2022-08-22, Execution Team 2022-10-03, Storage Engines - 2022-10-31 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
Unique index duplicate key check searches for the key's existence before insertion into the index. search_near cursor API is used for the purpose. With durable history, the search through a table with a lot of deleted content became more expensive hence causing regression with insertion into the unique indexes. A prefix-search API was added that would cause search_near to early exit if the search was past the prefix. The prefix-search is being replaced by a feature to configure bounds on a cursor. An existing prefix search can be replaced with a cursor bound on the prefix. This ticket will do so for the above-mentioned unique index usage in the server. |
| Comments |
| Comment by Githook User [ 26/Oct/22 ] | ||||||||||||
|
Author: {'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'}Message: | ||||||||||||
| Comment by Sulabh Mahajan [ 25/Aug/22 ] | ||||||||||||
|
We have decided to put down this ticket for more time. We want to get feedback from bounded cursors on | ||||||||||||
| Comment by Jie Chen [ 23/Aug/22 ] | ||||||||||||
|
From my understanding, I can give a go at explaining the differences in performance, especially why lower bound have little to no performance increase compared to a regular search near.
Edit: When a lower bound is set, WT would internally call cursor_row_search() which attempts to position the cursor closest to the key (irrelevant of visibility). Furthermore the range cursor will check the visibility of the key. If the current position is not visible, we will perform a next() to find a visible key. My assumption is that the next() is taking a long time The range cursor project will look into this. | ||||||||||||
| Comment by Sulabh Mahajan [ 22/Aug/22 ] | ||||||||||||
|
I scaled the repro up by a factor of 10 to get more input into performance differences, here is the time taken by the unique index updates:
I suspect that the versions where we set both the bounds would have been faster than the master if we would not been making a separate copy for the upper bound (prefix key's last byte incremented). The part that is unexpected is the version where we set the lower bound and follow with a next (I think it is the same for search-near) has to skip through a lot of records for some reason. This version is almost as inefficient as not having any bounds or prefix search: lower-bound-next = Lower bound only then next() call to find the record | ||||||||||||
| Comment by Sulabh Mahajan [ 12/Aug/22 ] | ||||||||||||
|
Here is the update from the work so far. I have tried to follow 2/3 approaches:
I have run MongoDB patch tests and all approaches seem to work and not produce any immediately obvious bugs. I am still using the repro script from The next steps:
| ||||||||||||
| Comment by Yujin Kang Park [ 06/Jul/22 ] | ||||||||||||
|
Depends on |