[SERVER-81335] Query operations that avoid going through the network when a shard is targeting only itself should create a fresh operation context Created: 22/Sep/23  Updated: 02/Feb/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sergi Mateo Bellido Assignee: Mihai Andrei
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on SERVER-79580 Remove HostTypeRequirement::kPrimaryS... Closed
depends on SERVER-83056 Confirm $lookup local read optimizati... Closed
is depended on by SERVER-81234 analyze_shard_key FSM workload fails ... Closed
is depended on by SERVER-81235 agg_graph_lookup FSM workload not fai... Closed
Problem/Incident
is caused by SERVER-77427 Avoid going through the network when ... Closed
Related
is related to SERVER-81007 FSM workloads no longer fail when $co... Closed
Assigned Teams:
Query Execution
Operating System: ALL
Backport Requested:
v7.3
Sprint: QE 2023-10-02, QE 2023-10-16, QE 2024-02-19
Participants:
Linked BF Score: 160

 Description   

Trying to keep this ticket generic, although I believe that most of the related issues / BF tickets are related to the optimization performed in SERVER-77427.

Why didn't we see these errors before?
Mostly due to SERVER-81007. The FSM framework was swallowing some exceptions. This was addressed in 7.1.

 

Affected versions
The bug that caused those errors to be shallowed  (SERVER-79735) was introduced on 7.1.  Thus, I would expect that if any previous version was suffering of these failures at least our testing coverage would have caught them. What I believe is the conflicting commit (SERVER-77427) was introduced on 7.1, which matches with the timeline.

 
About the issue
In the query framework we have at least one place in which if we detect that an operation that was going to be sent through the network is just involving the current shard, we end up executing it locally. This local execution is performed without creating a fresh operation context (i.e. reusing the existing one), which could be problematic, specially for components like the OperationShardingState that assumes that a pair of <OpCtx, nss> can only be associated with one shard version. Note that this is just an example but we could have other components doing similar assumptions.

The problematic interleaving is the following:

  1. Router sends an operation with SV1 and nss on a certain shard.
  2. Shard start its execution, registering the <OpCtx, nss> with SV1.
  3. A DDL operation bump the shardVersion of that shard to SV2.
  4. The shard continues the execution of the operation, and it reaches a point in which in theory it has to send a request to a set of shards. This set only includes itself, so it ends up performing a direct execution of the command.
  5. We try to register the pair <OpCtx, nss> with SV2 on the OperationShardingState but because of 2. we end up failing.

Update
it even looks like we don't really need to have a bump of the shard version (step 3): some kind of query operations, I believe that the ones that use $mergeCursors, initially attach ChunkVersion::IGNORED, so if that operation needs to perform some targeting and end ups deciding to perform a local execution it, it will also hit the same failure (see BF-30079).

 

How should we fix it?
I asked about the semantics of command re-entrance to service-arch. They mentioned that ideally we should create a fresh operation context in this situations, to avoid hitting assumptions like the one explained above.

Another thing we could consider is reverting SERVER-77427 if it ends up being the root cause of the failures.



 Comments   
Comment by Ivan Fefer [ 15/Nov/23 ]

Currently the main way to test this is $lookup.
However, currently it doesn't support unsplittable collections, so I am blocking this ticket until we do SERVER-79580.
It will give us good test coverage for potential solutions.

Comment by Githook User [ 27/Sep/23 ]

Author:

{'name': 'Ivan Fefer', 'email': 'ivan.fefer@mongodb.com', 'username': 'Fefer-Ivan'}

Message: SERVER-81335 Fallback to remote read if local read for sub pipeline fails with IllegalChangeToExpectedShardVersion
Branch: master
https://github.com/mongodb/mongo/commit/b5330fcbc5989e6f7750bba166379ea75d47502f

Comment by Ivan Fefer [ 27/Sep/23 ]

Revert is merged to 7.1 branch: https://github.com/mongodb/mongo/commit/aae297113407b219885bfe44f25a1fb377e04557

The ticket remains open, so we can properly fix this issue in master for the future releases.

Comment by Maria Prinus [ 26/Sep/23 ]

Hi ivan.fefer@mongodb.com,

What is the actual status of this one? Do you still need to merge the change/revert to the 7.1 branch?

Generated at Thu Feb 08 06:46:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.