Core Server / SERVER-43229

Merge chunk can fail if the config server fails over just after the metadata is committed on the config primary

    • Fully Compatible
    • ALL
    • v4.2, v4.0
    • Sharding 2019-10-07
    • 8

      The mergeChunk command will fail if there is a config failover just after _configsvrCommitChunkMerge finishes and the new primary does not have the updated metadata in its majority committed snapshot. The following is the scenario in which this happens:

      1. Shard is running mergeChunk, sends _configsvrCommitChunkMerge to the config server.
      2. Config server completes _configsvrCommitChunkMerge and updates its local metadata.
      3. Config server primary steps down immediately.
      4. Shard gets the response from the config server and flushes its filtering metadata before checking the response from the config server. This refreshes from the new config primary, which does not have the updated metadata in its majority committed snapshot yet.
      5. Shard gets a write concern error from the config.
      6. The command is retried. This resends _configsvrCommitChunkMerge to the new primary with the old chunks to be merged.
      7. The new primary now has the updated metadata in its majority committed snapshot, so it fails to find the chunks to be merged and fails the command.
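
      The sequence above can be sketched as a small toy model. This is only an illustration under stated assumptions, not server code: ConfigNode, commit_chunk_merge and run_scenario are hypothetical names, chunks are reduced to (min, max) tuples, and the commit is assumed to look up chunks in the node's majority committed view, following the wording of step 7.

```python
# Toy model of the scenario above. None of these names are MongoDB APIs.

class ConfigNode:
    """A config server node with a local view and a majority committed view."""

    def __init__(self, chunks):
        self.local = set(chunks)
        self.majority = set(chunks)

    def commit_chunk_merge(self, old_chunks, merged):
        # Simplification: the chunks to merge are looked up in the majority view.
        if not set(old_chunks) <= self.majority:
            raise LookupError("chunks to be merged not found")
        self.local = (self.local - set(old_chunks)) | {merged}


def run_scenario():
    initial = [(0, 10), (10, 20)]
    old_primary = ConfigNode(initial)
    new_primary = ConfigNode(initial)
    events = []

    # Steps 1-2: the old primary commits the merge and updates its local metadata.
    old_primary.commit_chunk_merge(initial, (0, 20))
    # The write reaches the new primary, but is not majority committed there yet.
    new_primary.local = set(old_primary.local)

    # Steps 3-4: failover, then the shard refreshes from the new primary's
    # majority committed snapshot, which still holds the old chunks.
    shard_view = set(new_primary.majority)
    events.append(("refresh", sorted(shard_view)))

    # Step 5: the shard sees a write concern error and will retry.
    # Meanwhile the majority commit point advances on the new primary.
    new_primary.majority = set(new_primary.local)

    # Steps 6-7: the retry carries the old chunks, which can no longer be found.
    try:
        new_primary.commit_chunk_merge(sorted(shard_view), (0, 20))
        events.append(("retry", "ok"))
    except LookupError as exc:
        events.append(("retry", str(exc)))
    return events
```

      In this model the refresh returns the two unmerged chunks and the retry raises the lookup error, which mirrors the failure described in step 7.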

            Assignee:
            janna.golden@mongodb.com Janna Golden
            Reporter:
            janna.golden@mongodb.com Janna Golden
            Votes:
            0
            Watchers:
            3
