[SERVER-83056] Confirm $lookup local read optimization doesn't miss data migrated during the read Created: 09/Nov/23  Updated: 15/Nov/23  Resolved: 15/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Ivan Fefer Assignee: Ivan Fefer
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-81335 Query operations that avoid going thr... Open
Backport Requested:
v7.2
Sprint: QE 2023-11-13, QE 2023-11-27
Participants:

 Description   

When $lookup targets only the local shard for its sub-pipeline, it performs local reads.

However, if the same query has already read the targeted collection, storage snapshots may still be held inside the OperationContext, Transaction, etc., and these stale snapshots can prevent the query from reading the latest data.

We need to make sure that there is no discrepancy between the routing info that we use and the data that we read.
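
To make the concern concrete, here is a minimal toy model in Python. It is only a sketch: `ToyShard`, `local_lookup`, and the integer snapshot ids are hypothetical names invented for illustration and do not correspond to any mongod internals. It shows how a local read that pairs fresh routing info with a snapshot opened before the migration could miss a migrated document.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical shard that keeps every committed version of its key set."""
    name: str
    # history[i] is the set of keys visible at snapshot i
    history: list = field(default_factory=lambda: [frozenset()])

    def commit(self, keys):
        # Each commit creates a new snapshot; older snapshots stay readable.
        self.history.append(frozenset(keys))

    def latest_snapshot(self):
        return len(self.history) - 1

    def read(self, key, snapshot):
        return key in self.history[snapshot]


def local_lookup(shard, routing, key, snapshot):
    """Read locally, but only if routing says this shard owns the key."""
    if routing[key] != shard.name:
        raise RuntimeError("key is not local; a remote read would be needed")
    return shard.read(key, snapshot)


shard_b = ToyShard("B")
routing = {"k": "A"}                 # key "k" starts out on shard A

stale = shard_b.latest_snapshot()    # snapshot left over from an earlier read
shard_b.commit({"k"})                # migration delivers "k" to shard B
routing["k"] = "B"                   # routing info now points at shard B

# Fresh routing + stale snapshot: the migrated document is missed.
missed = local_lookup(shard_b, routing, "k", stale)
# Fresh routing + fresh snapshot: the document is found.
found = local_lookup(shard_b, routing, "k", shard_b.latest_snapshot())
print(missed, found)
```

In this model the discrepancy only appears when the routing table and the snapshot come from different points in time, which is the combination this ticket sets out to rule out.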



 Comments   
Comment by Ivan Fefer [ 15/Nov/23 ]

The tests used to check this won't be committed: they are very specific and required some hacky failpoints inside mongod that I don't think belong in the master branch.

Comment by Ivan Fefer [ 15/Nov/23 ]

If data is migrated during a $lookup, the $lookup successfully uses up-to-date routing information.

If snapshot read concern is set, it is respected, and we target the shard that owned the data at the specified cluster time.

If we are inside a transaction, $lookup into a sharded collection is not allowed.
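
The three observations above can be summarized with a small Python sketch. This is a toy model, not server code: `OWNERSHIP`, `owner_at`, and `target_shard` are hypothetical names, and cluster times are plain integers chosen for illustration.

```python
import bisect

# Hypothetical ownership history for one chunk: (cluster_time, owner shard),
# sorted by the time at which each owner took over.
OWNERSHIP = [(0, "shardA"), (100, "shardB")]


def owner_at(history, cluster_time):
    """Return the shard that owned the chunk at the given cluster time."""
    times = [t for t, _ in history]
    idx = bisect.bisect_right(times, cluster_time) - 1
    if idx < 0:
        raise ValueError("no owner recorded at this cluster time")
    return history[idx][1]


def target_shard(history, latest_time, at_cluster_time=None, in_transaction=False):
    """Pick the shard a $lookup sub-pipeline would target in this toy model."""
    if in_transaction:
        # Mirrors the observation above: not allowed inside a transaction.
        raise RuntimeError("$lookup into a sharded collection is not allowed here")
    # Without a snapshot time, use the newest routing information.
    t = latest_time if at_cluster_time is None else at_cluster_time
    return owner_at(history, t)


print(target_shard(OWNERSHIP, latest_time=150))                      # shardB
print(target_shard(OWNERSHIP, latest_time=150, at_cluster_time=50))  # shardA
```

With no snapshot time the lookup follows the migration to `shardB`; with a snapshot time of 50 it goes back to `shardA`, the owner at that time.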

 

There are some issues with unsplittable collections, but they don't affect the 7.2 release, so I am going to close this ticket and work on them in SERVER-83220.

Generated at Thu Feb 08 06:51:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.