[SERVER-58750] Investigate whether InternalPlans::indexScan() executor always scans forward Created: 22/Jul/21  Updated: 29/Oct/23  Resolved: 14/Dec/21

Status: Closed
Project: Core Server
Component/s: Query Execution, Sharding
Affects Version/s: None
Fix Version/s: 5.3.0

Type: Question Priority: Major - P3
Reporter: Pierlauro Sciarelli Assignee: Pierlauro Sciarelli
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: QE 2021-08-23, QE 2021-09-06, Sharding EMEA 2021-11-29, Sharding EMEA 2021-12-13, Sharding EMEA 2021-12-27
Participants:

 Description   

Investigate whether subsequent calls of getNextDocument can result in scanning a key lower than the previous one.

The guess would be that this is not possible because index keys are sorted and - in case of yielding - it can at most happen for some previous key to be lost.

Opening this ticket to double check because SERVER-20700 introduced an additional sorting of the splitVector result (vector of keys filled in with subsequent getNextDocument calls), so it would be worth checking if the original bug is still around.



 Comments   
Comment by Githook User [ 14/Dec/21 ]

Author:

{'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'}

Message: SERVER-58750 get rid of unnecessary sorting of split keys
Branch: master
https://github.com/mongodb/mongo/commit/bf7adca3bd56959f320eccb610668fe21c196243

Comment by David Storch [ 02/Sep/21 ]

Hi pierlauro.sciarelli! I have inspected the code more carefully, and I believe the "always scans forward" guarantee can indeed be relied upon. One important note is that this does not apply in general to any PlanExecutor, or to any call to PlanExecutorImpl::getNextDocument(). Instead, I confirmed that this guarantee exists specifically for PlanExecutor objects that were constructed via the InternalPlanner::indexScan() interface. My understanding is that the relevant sharding code for splitVector is using InternalPlanner::indexScan().

I suspect that the original problem which spawned SERVER-20700 was yielding-related, so I made sure to check that the yielding process could not cause a storage cursor to move backwards. During a yield, the underlying index cursor (a SortedDataInterface::Cursor) is saved and restored, which causes the WT cursor to be unpositioned and then repositioned. The documentation for SortedDataInterface::restore() states the following:

If the former position no longer exists, a following call to next() will return the next closest position in the direction of the scan, if any.

This sounds to me like it is guaranteeing that restore() cannot cause the position of the cursor to move backwards (or, more precisely, cannot cause the cursor to move opposite the direction of the scan). I double-checked the implementation of SortedDataInterface::Cursor for WiredTiger, and it has explicit code to implement this behavior. See here: https://github.com/mongodb/mongo/blob/d080cd863aeac79eacae88579749706d25842b1b/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp#L1065-L1103

I am going to move this ticket to the sharding backlog for re-triage. Given the changes from commit https://github.com/mongodb/mongo/commit/cdb43efd361b20a0c6a9bf7ad6e14dd307b6a1a4 against this ticket, it seems like this ticket can be repurposed for both deleting the TODO comments and removing the unnecessary sort of splitKeys. If we simply close this ticket, then the "TODO SERVER-58750" comments added by commit cdb43efd36 will be orphaned.

Comment by Githook User [ 22/Jul/21 ]

Author:

{'name': 'Pierlauro Sciarelli', 'email': 'pierlauro.sciarelli@mongodb.com', 'username': 'pierlauro'}

Message: SERVER-58750 Add TODOs for SERVER-58750
Branch: master
https://github.com/mongodb/mongo/commit/cdb43efd361b20a0c6a9bf7ad6e14dd307b6a1a4

Generated at Thu Feb 08 05:45:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.