[SERVER-30714] Handle step down error in ReplicationCoordinatorExternalStateImpl::_shardingOnTransitionToPrimaryHook
Created: 17/Aug/17  Updated: 30/Oct/23  Resolved: 26/Sep/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.5.11
Fix Version/s: 4.0.5, 4.1.4

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Kaloian Manassiev
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0, v3.6
Sprint: Sharding 2018-10-08
Participants:
Case:
Linked BF Score: 25

 Description   

The _shardingOnTransitionToPrimaryHook callback is invoked when a node becomes primary. If that node is part of a sharded cluster, the callback executes the "ShardingStateRecovery" step, which reads from disk the optime of the last write that the node performed against the config server (such a write being a chunk migration commit).

The _shardingOnTransitionToPrimaryHook step is executed after the replMutex has been unlocked, so the node can lose its majority quorum in the meantime and never actually become primary. Because the "ShardingStateRecovery" step performs majority reads, it fails in this case, which in turn crashes the replication step-up with assert 40107.

Since this is an expected situation, the sharding code should handle it appropriately.
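
For illustration only, the intended handling could look roughly like the sketch below: a step-down error from the recovery step is treated as expected and skipped, while any other failure remains fatal. Every name here (ErrorCode, Status, HookResult, isNotMasterError, and the free function shardingOnTransitionToPrimaryHook) is a hypothetical stand-in and not the server's actual types or the committed fix.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the MongoDB server types.
enum class ErrorCode { OK, NotMaster, InterruptedDueToStepDown, InternalError };

struct Status {
    ErrorCode code;
    std::string reason;
    bool isOK() const { return code == ErrorCode::OK; }
};

// Assumed classification: errors meaning "this node is no longer (or never became) primary".
bool isNotMasterError(ErrorCode code) {
    return code == ErrorCode::NotMaster || code == ErrorCode::InterruptedDueToStepDown;
}

enum class HookResult { Recovered, SkippedDueToStepDown };

// Runs the recovery step and classifies its outcome. The hook runs without the
// replication mutex held, so a step-down error is an expected outcome and is
// tolerated; any other failure is still surfaced as fatal (here: an exception).
HookResult shardingOnTransitionToPrimaryHook(
    const std::function<Status()>& recoverShardingState) {
    assert(recoverShardingState && "recovery callback must be set");

    const Status status = recoverShardingState();
    if (status.isOK()) {
        return HookResult::Recovered;
    }
    if (isNotMasterError(status.code)) {
        // The node lost its majority before the majority read completed.
        return HookResult::SkippedDueToStepDown;
    }
    throw std::runtime_error("sharding state recovery failed: " + status.reason);
}
```

In this sketch the caller decides what to do with SkippedDueToStepDown; the point is only that the step-up path no longer aborts when the failure is caused by losing primaryship.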



 Comments   
Comment by Githook User [ 05/Dec/18 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-30714 Handle 'not master' errors in ReplicationCoordinatorExternalStateImpl::_shardingOnTransitionToPrimaryHook

(cherry picked from commit a0ebd4bb3a30fdf574fd08ab473e7d6ce1b59619)
Branch: v4.0
https://github.com/mongodb/mongo/commit/8bd1c5d455ae101d7522a5c2918738601f8c6317

Comment by Githook User [ 26/Sep/18 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-30714 Handle 'not master' errors in ReplicationCoordinatorExternalStateImpl::_shardingOnTransitionToPrimaryHook
Branch: master
https://github.com/mongodb/mongo/commit/a0ebd4bb3a30fdf574fd08ab473e7d6ce1b59619
