[SERVER-38409] Shard can crash at step-up due to FailedToSatisfyReadPreference exception during minOpTime recovery Created: 05/Dec/18  Updated: 27/Oct/23  Resolved: 05/Dec/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-65766 ShardingStateRecovery makes remote ca... Closed
Assigned Teams:
Sharding
Operating System: ALL
Participants:
Linked BF Score: 8

 Description   

The ReplicationCoordinatorExternalStateImpl::_shardingOnTransitionToPrimaryHook call already expects NotMaster and ShutdownInProgress errors, but in certain cases it can also crash with a FailedToSatisfyReadPreference exception if the config server is unavailable when the shard node steps up.
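
A minimal Python sketch of the failure mode described above, not the server's actual C++ code. It assumes a hook that tolerates a fixed set of expected errors and lets anything else propagate as fatal. The exception classes, `EXPECTED_ERRORS`, and `on_transition_to_primary` are hypothetical names that only mirror the error names in this ticket.

```python
# Hypothetical stand-ins for the error codes named in the ticket.
class NotMaster(Exception):
    pass


class ShutdownInProgress(Exception):
    pass


class FailedToSatisfyReadPreference(Exception):
    pass


# Assumption: the hook treats only these errors as expected at step-up.
EXPECTED_ERRORS = (NotMaster, ShutdownInProgress)


def on_transition_to_primary(recover):
    """Run the recovery callable, tolerating only the expected errors."""
    try:
        recover()
    except EXPECTED_ERRORS as exc:
        # Expected at step-up: report the error and carry on.
        return "tolerated " + type(exc).__name__
    # Any other exception is not caught here and propagates to the caller,
    # which in this sketch stands in for the crash.
    return "ok"


def config_server_down():
    raise FailedToSatisfyReadPreference("config server unreachable")


def stepped_down():
    raise NotMaster("node is no longer primary")
```

In this sketch, `on_transition_to_primary(stepped_down)` returns normally, while `on_transition_to_primary(config_server_down)` raises, because FailedToSatisfyReadPreference is not in the tolerated set.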



 Comments   
Comment by Kaloian Manassiev [ 05/Dec/18 ]

The sharding minOpTime recovery procedure ensures that, after a shard starts up or becomes primary, it can see with certainty the chunks that it owned after the last time it donated a chunk. It works as follows: before a migration is persisted on the config server, the node writes a counter indicating that it is persisting the donation of a chunk; the counter is cleared once the migration has been successfully persisted.

When a node starts up and discovers that the number of active "committers" of the config server metadata is greater than 0, it must consult the config server's primary to find out whether the previous migration actually committed.

Because of this requirement, if the config server is unavailable in this situation, the node cannot safely continue starting up as the primary of a shard, since doing so would risk data loss. This behaviour therefore "Works as Designed".
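
The procedure described in this comment can be sketched roughly as follows. This is an illustrative Python model under stated assumptions, not the server's implementation: `ConfigServer`, `Shard`, `ConfigServerUnavailable`, and all method and attribute names are hypothetical, and the counter is held in memory here, whereas the comment describes it as being written (persisted) by the node.

```python
class ConfigServerUnavailable(Exception):
    """Hypothetical stand-in for FailedToSatisfyReadPreference."""


class ConfigServer:
    """Toy config server holding the list of committed migrations."""

    def __init__(self):
        self.available = True
        self.committed = []

    def commit_migration(self, chunk):
        if not self.available:
            raise ConfigServerUnavailable("cannot reach config primary")
        self.committed.append(chunk)

    def committed_migrations(self):
        if not self.available:
            raise ConfigServerUnavailable("cannot reach config primary")
        return list(self.committed)


class Shard:
    def __init__(self, config):
        self.config = config
        # Counter of in-flight "committers"; a plain attribute in this sketch.
        self.active_committers = 0

    def donate_chunk(self, chunk):
        # Step 1: record that a commit is in progress BEFORE contacting
        # the config server.
        self.active_committers += 1
        # Step 2: persist the migration on the config server. If this
        # raises, the counter stays above zero, like a crash mid-commit.
        self.config.commit_migration(chunk)
        # Step 3: clear the marker once the migration is persisted.
        self.active_committers -= 1

    def recover_on_step_up(self):
        if self.active_committers == 0:
            # No commit was in flight, so local knowledge is trustworthy.
            return None
        # A commit may or may not have gone through: only the config
        # server's primary can tell. If it is unreachable, this raises
        # and the node cannot safely become primary.
        return self.config.committed_migrations()
```

The design point the sketch tries to capture is that `recover_on_step_up` has no safe fallback when the counter is non-zero: guessing either outcome of the in-flight migration could leave the shard with a wrong view of chunk ownership, so failing is the only safe option.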
