Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-81335

Query operations that avoid going through the network when a shard is targeting only itself should create a fresh operation context

    • Type: Icon: Bug Bug
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Query Execution
    • ALL
    • v8.0, v7.3
    • QE 2023-10-02, QE 2023-10-16, QE 2024-02-19, QE 2024-03-04, QE 2024-03-18, QE 2024-04-01
    • 160

      Trying to keep this ticket generic, although I believe that most of the related issues / BF tickets are related to the optimization performed in SERVER-77427.

      Why didn't we see these errors before?
      Mostly due to SERVER-81007. The FSM framework was swallowing some exceptions. This was addressed in 7.1.


      Affected versions
      The bug that caused those errors to be shallowed  (SERVER-79735) was introduced on 7.1.  Thus, I would expect that if any previous version was suffering of these failures at least our testing coverage would have caught them. What I believe is the conflicting commit (SERVER-77427) was introduced on 7.1, which matches with the timeline.

      About the issue
      In the query framework we have at least one place in which if we detect that an operation that was going to be sent through the network is just involving the current shard, we end up executing it locally. This local execution is performed without creating a fresh operation context (i.e. reusing the existing one), which could be problematic, specially for components like the OperationShardingState that assumes that a pair of <OpCtx, nss> can only be associated with one shard version. Note that this is just an example but we could have other components doing similar assumptions.

      The problematic interleaving is the following:

      1. Router sends an operation with SV1 and nss on a certain shard.
      2. Shard start its execution, registering the <OpCtx, nss> with SV1.
      3. A DDL operation bump the shardVersion of that shard to SV2.
      4. The shard continues the execution of the operation, and it reaches a point in which in theory it has to send a request to a set of shards. This set only includes itself, so it ends up performing a direct execution of the command.
      5. We try to register the pair <OpCtx, nss> with SV2 on the OperationShardingState but because of 2. we end up failing.

      it even looks like we don't really need to have a bump of the shard version (step 3): some kind of query operations, I believe that the ones that use $mergeCursors, initially attach ChunkVersion::IGNORED, so if that operation needs to perform some targeting and end ups deciding to perform a local execution it, it will also hit the same failure (see BF-30079).


      How should we fix it?
      I asked about the semantics of command re-entrance to service-arch. They mentioned that ideally we should create a fresh operation context in this situations, to avoid hitting assumptions like the one explained above.

      Another thing we could consider is reverting SERVER-77427 if it ends up being the root cause of the failures.

            Unassigned Unassigned
            sergi.mateo-bellido@mongodb.com Sergi Mateo Bellido
            0 Vote for this issue
            16 Start watching this issue