Core Server / SERVER-48500

_shardsvrShardCollection after manual intervention can succeed without writing chunks

    • Sharding EMEA
    • Fully Compatible
    • ALL
    • 15

The following sequence of events can cause _shardsvrShardCollection to return ok:1 without actually writing chunks when a user follows the recommended procedure for handling a ManualInterventionRequired error:

      1. Config primary sends _shardsvrShardCollection to the primary node of the primary shard
      2. Primary shard writes chunks to config.chunks with majority write concern
      3. The primary shard's primary node steps down immediately after sending the config.collections update to the config server
      4. Config primary retries _shardsvrShardCollection on the primary shard's new primary node (because of the idempotent retry policy) before the config.collections update arrives or is majority committed
      5. Primary shard reads from config.collections with majority read concern and continues with sharding the collection because it does not see the config.collections write
      6. Primary shard throws ManualInterventionRequired when it finds chunks already exist for the namespace (from the first attempt)
      7. A user deletes the namespace's chunks before retrying shardCollection
      8. Config primary sends _shardsvrShardCollection to the primary node of the primary shard again
      9. Primary shard reads config.collections after the write from the first attempt is majority committed, assumes the collection is sharded, and returns ok:1, leaving the collection without any chunks
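
The sequence above can be replayed against a small in-memory model. This is only a sketch under stated assumptions, not server code: ToyCluster, commit_pending, shardsvr_shard_collection, and replay_ticket_sequence are hypothetical names, and the model keeps just two pieces of state, config.chunks and config.collections, plus a set of config.collections writes that have been sent but are not yet majority committed.

```python
class ManualInterventionRequired(Exception):
    """Stand-in for the server error of the same name."""


class ToyCluster:
    """Hypothetical model of the two config collections named in the ticket."""

    def __init__(self):
        self.chunks = {}                   # config.chunks: namespace -> chunk ids
        self.collections = set()           # config.collections, majority committed
        self.pending_collections = set()   # sent, but not yet majority committed

    def commit_pending(self):
        """Make every pending config.collections write majority committed."""
        self.collections |= self.pending_collections
        self.pending_collections.clear()

    def shardsvr_shard_collection(self, ns, step_down_after_send=False):
        # Majority read: only committed config.collections entries are visible.
        if ns in self.collections:
            return {"ok": 1}  # assumes the collection is already sharded
        if self.chunks.get(ns):
            raise ManualInterventionRequired(f"chunks already exist for {ns}")
        self.chunks[ns] = ["chunk-0"]        # majority write to config.chunks
        self.pending_collections.add(ns)     # config.collections update is sent
        if step_down_after_send:
            return None                      # node stepped down, no reply
        self.commit_pending()
        return {"ok": 1}


def replay_ticket_sequence(ns="test.coll"):
    cluster = ToyCluster()
    # Steps 1-3: first attempt writes chunks, sends the update, then steps down.
    cluster.shardsvr_shard_collection(ns, step_down_after_send=True)
    # Steps 4-6: the retry does not see config.collections but finds chunks.
    try:
        cluster.shardsvr_shard_collection(ns)
        raised = False
    except ManualInterventionRequired:
        raised = True
    # Step 7: the user deletes the namespace's chunks.
    del cluster.chunks[ns]
    # The config.collections write from the first attempt becomes majority committed.
    cluster.commit_pending()
    # Steps 8-9: the next attempt sees config.collections and returns ok:1.
    result = cluster.shardsvr_shard_collection(ns)
    return raised, result, cluster
```

In this model, replay_ticket_sequence() ends with the namespace present in collections and absent from chunks, which is the inconsistent state the ticket describes. The model collapses replication and networking into a single commit_pending call, so it illustrates the ordering of events only.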

Assignee: backlog-server-sharding-emea [DO NOT USE] Backlog - Sharding EMEA
Reporter: jack.mulrow@mongodb.com Jack Mulrow