  1. Core Server
  2. SERVER-67296

Mark the OpCtx of the configsvr commands used to commit chunk-related DDL ops as interruptible


Details

    • Fully Compatible
    • ALL
    • v6.0, v5.0, v4.4
    • Sharding EMEA 2022-07-11, Sharding EMEA 2022-07-25, Sharding EMEA 2022-08-08
    • 5

    Description

      The pattern that we have for these operations is always the same:

      1. We take the kChunks lock.
      2. We validate that the requested operation can be applied.
      3. We compute the new CollectionVersions.
      4. Finally, through a transaction (applyOps in the past, nowadays internal transactions) we modify one or more documents on config.chunks.

      Let's say that a thread on the primary node of the CSRS is blocked just after step 3 and the node steps down. Another node steps up and performs some changes to the chunks that are unrelated to the blocked operation. Finally, the old primary node steps up again and the blocked thread commits the migration, installing a stale CollectionVersion.
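
      The interleaving above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not the server's implementation: `OpCtx`, `ConfigChunks`, `step_down` and `run_scenario` are hypothetical names, the CollectionVersion is reduced to a single integer, and the step-down is assumed to kill only the operations that were marked as interruptible.

```python
from dataclasses import dataclass


class Interrupted(Exception):
    """Signals that a step-down killed the operation."""


@dataclass
class OpCtx:
    # Hypothetical stand-in for an operation context, not the server's class.
    interruptible: bool
    killed: bool = False

    def check_for_interrupt(self) -> None:
        if self.killed:
            raise Interrupted("operation killed by step-down")


@dataclass
class ConfigChunks:
    # Replicated state shared by all CSRS nodes in this toy model.
    collection_version: int = 1


def step_down(in_flight_ops: list) -> None:
    # Assumption: a step-down only kills operations marked as interruptible.
    for op in in_flight_ops:
        if op.interruptible:
            op.killed = True


def run_scenario(interruptible: bool) -> int:
    chunks = ConfigChunks()
    op = OpCtx(interruptible=interruptible)

    # Steps 1-3 on the old primary: lock, validate, compute the new version.
    new_version = chunks.collection_version + 1

    # The thread stalls here and the node steps down.
    step_down([op])

    # Another node steps up and commits unrelated chunk changes.
    chunks.collection_version += 2

    # The old primary steps up again and the stalled thread resumes at step 4.
    try:
        op.check_for_interrupt()
        chunks.collection_version = new_version  # stale write
    except Interrupted:
        pass  # the commit is abandoned; the caller could retry from step 1
    return chunks.collection_version


if __name__ == "__main__":
    print("non-interruptible:", run_scenario(False))
    print("interruptible:", run_scenario(True))
```

      In this model the non-interruptible run ends with the version moving backwards, because the resumed thread overwrites the newer value with the one it computed before the step-down. The interruptible run keeps the version written by the other node, because the check in front of step 4 raises and the commit is abandoned.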

      This problem affects the commit of split, merge and moveChunk. I would also double-check what happens for refineShardKey.

      We might need to backport this fix to older versions.


          People

            silvia.surroca@mongodb.com Silvia Surroca
            sergi.mateo-bellido@mongodb.com Sergi Mateo Bellido