  1. Core Server
  2. SERVER-30714

Handle step down error in ReplicationCoordinatorExternalStateImpl::_shardingOnTransitionToPrimaryHook

    • Fully Compatible
    • ALL
    • v4.0, v3.6
    • Sharding 2018-10-08
    • 25

      The _shardingOnTransitionToPrimaryHook callback is invoked when a node becomes primary. If that node is part of a sharded cluster, the callback executes the "ShardingStateRecovery" step, which reads from disk the optime of the last write that the node performed against the config server (such a write being a chunk migration commit).

      The _shardingOnTransitionToPrimaryHook step is executed after the replMutex has been unlocked, so the node can lose the majority quorum in the meantime and never become primary. Because the "ShardingStateRecovery" step performs majority reads, it fails in this case, which in turn crashes the replication step-up with assert 40107.

      Since this is an expected situation, the sharding code should handle it appropriately.
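
      One possible shape for such handling is sketched below. It is a standalone illustration, not the server's code: the Status struct, the ErrorCode values, the HookResult enum, and the isStepDownError helper are hypothetical stand-ins, and the recovery step is passed in as a callable so the example compiles on its own. The idea is only that an error meaning "this node is no longer becoming primary" is logged and tolerated, while any other recovery failure remains fatal.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical stand-ins for the server's error codes and Status type.
enum class ErrorCode {
    OK,
    NotPrimary,                       // illustrative name, not checked against the real list
    PrimarySteppedDown,               // illustrative name
    InterruptedDueToReplStateChange,  // illustrative name
    InternalError
};

struct Status {
    ErrorCode code;
    std::string reason;
    bool isOK() const { return code == ErrorCode::OK; }
};

// True for errors that only indicate the node lost (or never gained) primaryship.
bool isStepDownError(ErrorCode code) {
    return code == ErrorCode::NotPrimary ||
           code == ErrorCode::PrimarySteppedDown ||
           code == ErrorCode::InterruptedDueToReplStateChange;
}

enum class HookResult { Recovered, SkippedDueToStepDown };

// Hypothetical hook body. `recover` stands in for the ShardingStateRecovery
// step, which may fail if the node loses its majority quorum mid-step-up.
HookResult shardingOnTransitionToPrimaryHook(const std::function<Status()>& recover) {
    const Status status = recover();
    if (status.isOK()) {
        return HookResult::Recovered;
    }
    if (isStepDownError(status.code)) {
        // Expected race: the node stepped down before recovery finished.
        std::cerr << "Sharding state recovery skipped due to step down: "
                  << status.reason << '\n';
        return HookResult::SkippedDueToStepDown;
    }
    // Any other failure is unexpected, so terminating is still appropriate.
    std::cerr << "Sharding state recovery failed: " << status.reason << '\n';
    std::abort();
}
```

      In this sketch the caller can inspect the returned HookResult and abandon the rest of the transition when recovery was skipped, rather than treating the failed majority read as fatal.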

            kaloian.manassiev@mongodb.com Kaloian Manassiev
            randolph@mongodb.com Randolph Tan