  1. Core Server
  2. SERVER-67296

Mark the OpCtx of the configsvr commands used to commit chunk-related DDL ops as interruptible

    • Fully Compatible
    • ALL
    • v6.0, v5.0, v4.4
    • Sharding EMEA 2022-07-11, Sharding EMEA 2022-07-25, Sharding EMEA 2022-08-08
    • 5

      The pattern that we have for these operations is always the same:

      1. We take the kChunks lock.
      2. We validate that the requested operation can be applied.
      3. We compute the new CollectionVersions.
      4. Finally, through a transaction (applyOps in the past, internal transactions nowadays), we modify one or more documents in config.chunks.
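
      The four steps above can be sketched as a toy, in-memory model. This is only an illustration under stated assumptions, not the server's code: ToyConfigServer, Version, and commit_split are hypothetical names, a Python lock stands in for the kChunks lock, a dict stands in for config.chunks, and the minor-version bump is a simplification of how new CollectionVersions are computed.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical stand-in for a chunk/collection version (major, minor)."""
    major: int
    minor: int


class ToyConfigServer:
    """Hypothetical in-memory model of the commit pattern; not MongoDB code."""

    def __init__(self):
        # Plays the role of the kChunks lock.
        self.chunks_lock = threading.Lock()
        # Plays the role of config.chunks: (min, max) range -> chunk version.
        self.chunks = {(0, 100): Version(1, 0)}

    def collection_version(self):
        # In this toy model the collection version is the highest chunk version.
        return max(self.chunks.values())

    def commit_split(self, chunk, split_point):
        with self.chunks_lock:                        # step 1: take the lock
            if chunk not in self.chunks:              # step 2: validate the request
                raise ValueError("chunk not found")
            lo, hi = chunk
            if not lo < split_point < hi:
                raise ValueError("split point outside chunk")

            current = self.collection_version()       # step 3: compute new versions
            left = Version(current.major, current.minor + 1)
            right = Version(current.major, current.minor + 2)

            # Step 4: apply all document changes together. The real server uses
            # a transaction; here we swap in a fully built replacement dict.
            updated = dict(self.chunks)
            del updated[chunk]
            updated[(lo, split_point)] = left
            updated[(split_point, hi)] = right
            self.chunks = updated
            return right
```

      Note that the version computed in step 3 is only safe to write in step 4 if nothing else has modified the chunks in between, which the lock guarantees only while the node remains primary.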

      Let's say that a thread on the primary node of the CSRS is blocked just after step 3 and the node steps down. Another node steps up and performs some changes to the chunks that are not related to the blocked operation. Finally, the old primary node steps up again and the blocked thread commits its operation (for example, a migration), installing a stale CollectionVersion that was computed before the step-down.
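
      The interleaving described above can be reproduced with a small simulation. It is a sketch under assumptions, not the server's mechanism: ToyReplicaSet, ToyOpCtx, Interrupted, and run_scenario are hypothetical names, the CollectionVersion is reduced to an integer, and "interruptible" is modeled as a check that the replication term has not changed since the operation started.

```python
from dataclasses import dataclass


class Interrupted(Exception):
    """Hypothetical error raised when the operation is interrupted."""


@dataclass
class ToyReplicaSet:
    """Hypothetical shared state: election term and current CollectionVersion."""
    term: int = 1
    collection_version: int = 10


class ToyOpCtx:
    """Hypothetical operation context that remembers the term it started in."""

    def __init__(self, replica_set, interruptible):
        self.replica_set = replica_set
        self.interruptible = interruptible
        self.start_term = replica_set.term

    def check_for_interrupt(self):
        # Assumption: an interruptible operation fails if any step-down or
        # step-up happened after it started.
        if self.interruptible and self.replica_set.term != self.start_term:
            raise Interrupted("replication state changed")


def run_scenario(interruptible):
    rs = ToyReplicaSet()
    op = ToyOpCtx(rs, interruptible)

    # Steps 1-3 on the old primary: the new version is computed, then the
    # thread blocks before committing.
    new_version = rs.collection_version + 1

    # The old primary steps down; another node steps up and commits
    # unrelated chunk changes, advancing the CollectionVersion.
    rs.term += 1
    rs.collection_version += 5

    # The old primary steps up again and the blocked thread resumes at step 4.
    rs.term += 1
    try:
        op.check_for_interrupt()
        rs.collection_version = new_version  # stale write
    except Interrupted:
        pass  # the commit is abandoned; the caller would have to retry
    return rs.collection_version
```

      In this model, run_scenario(False) ends with the CollectionVersion at 11, lower than the 15 installed by the other primary, while run_scenario(True) leaves it at 15 because the stale commit never runs.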

      This problem affects the commit of split, merge, and moveChunk. I would also double-check what happens for refineShardKey.

      We might need to backport this fix to older versions.

            silvia.surroca@mongodb.com Silvia Surroca
            sergi.mateo-bellido@mongodb.com Sergi Mateo Bellido