Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Query Execution
Operating System:
ALL
Backport Requested:

v8.1, v8.0, v7.3
Sprint:
QE 2023-10-02, QE 2023-10-16, QE 2024-02-19, QE 2024-03-04, QE 2024-03-18, QE 2024-04-01
Linked BF Score:
160
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Trying to keep this ticket generic, although I believe that most of the related issues / BF tickets are related to the optimization performed in ~~SERVER-77427~~.

Why didn't we see these errors before?
Mostly due to ~~SERVER-81007~~. The FSM framework was swallowing some exceptions. This was addressed in 7.1.

Affected versions
The bug that caused those errors to be shallowed (~~SERVER-79735~~) was introduced on 7.1. Thus, I would expect that if any previous version was suffering of these failures at least our testing coverage would have caught them. What I believe is the conflicting commit (~~SERVER-77427~~) was introduced on 7.1, which matches with the timeline.

About the issue
In the query framework we have at least one place in which if we detect that an operation that was going to be sent through the network is just involving the current shard, we end up executing it locally. This local execution is performed without creating a fresh operation context (i.e. reusing the existing one), which could be problematic, specially for components like the OperationShardingState that assumes that a pair of <OpCtx, nss> can only be associated with one shard version. Note that this is just an example but we could have other components doing similar assumptions.

The problematic interleaving is the following:

Router sends an operation with SV1 and nss on a certain shard.
Shard start its execution, registering the <OpCtx, nss> with SV1.
A DDL operation bump the shardVersion of that shard to SV2.
The shard continues the execution of the operation, and it reaches a point in which in theory it has to send a request to a set of shards. This set only includes itself, so it ends up performing a direct execution of the command.
We try to register the pair <OpCtx, nss> with SV2 on the OperationShardingState but because of 2. we end up failing.

Update
it even looks like we don't really need to have a bump of the shard version (step 3): some kind of query operations, I believe that the ones that use $mergeCursors, initially attach ChunkVersion::IGNORED, so if that operation needs to perform some targeting and end ups deciding to perform a local execution it, it will also hit the same failure (see BF-30079).

How should we fix it?
I asked about the semantics of command re-entrance to service-arch. They mentioned that ideally we should create a fresh operation context in this situations, to avoid hitting assumptions like the one explained above.

Another thing we could consider is reverting ~~SERVER-77427~~ if it ends up being the root cause of the failures.

depends on

SERVER-79580 Remove HostTypeRequirement::kPrimaryShard from $lookup

Closed

SERVER-83056 Confirm $lookup local read optimization doesn't miss data, migrated during the read

Closed

is caused by

SERVER-77427 Avoid going through the network when a shard is targeting only itself for a $lookup subpipeline

Closed

is depended on by

SERVER-81234 analyze_shard_key FSM workload fails after changing shard version

Closed

SERVER-81235 agg_graph_lookup FSM workload not failing with expected QueryPlanKilled error

Closed

is related to

SERVER-81007 FSM workloads no longer fail when $config.states functions throw exceptions

Closed

related to

SERVER-107054 Investigate opCtx and reentrancy issues

In Progress

(1 is related to, 1 related to)

Assignee:: Unassigned
Reporter:: Sergi Mateo Bellido
Participants:: Githook User, Ivan Fefer, Maria Prinus, Sergi Mateo Bellido
Votes:: 0 Vote for this issue
Watchers:: 16 Start watching this issue

Created:: Sep 22 2023 07:00:09 AM UTC
Updated:: Jul 02 2025 05:20:24 PM UTC
Confidence Status Last Update:: 25/Mar/24 9:14 PM

Details

Description

Attachments

Issue Links

Forms

Activity

People

Dates