[SERVER-51805] Split Chunk operation is not idempotent Created: 22/Oct/20  Updated: 06/Dec/22  Resolved: 22/Oct/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Sergi Mateo Bellido Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Sharding
Participants:
Linked BF Score: 24

Description

Right now we assume that the split chunk operation on the config server is idempotent, but in fact it is not: depending on where it fails, we may or may not be able to recover from the failure.

The commitChunkSplit method, triggered by a _configsvrCommitChunkSplit command, executes the following steps:
1) Apply the batch of chunk updates
2) Execute getShardAndCollectionVersion - resulting in an exhaustive on the config primary.

If a stepdown happens after step 1 but before step 2, the operation is retried (at most 3 times) and fails, even though the split technically took effect the first time applyChunkOpsDeprecated executed successfully.
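
The failure mode can be illustrated with a toy model. This is a minimal sketch, not the server code: ToyConfigServer, commit_chunk_split, split_with_retries, and both exception types are hypothetical names, and the way the retry fails here (a precondition check that no longer finds the original chunk) is an assumption made for illustration. The ticket only states that the retry fails.

```python
class StepDown(Exception):
    """Simulated primary stepdown (treated as a retriable error)."""


class PreconditionFailed(Exception):
    """Simulated non-retriable error: the chunk to split is gone."""


class ToyConfigServer:
    """Hypothetical stand-in for the config server's chunk metadata."""

    def __init__(self):
        # One chunk covering [0, 100); each chunk is a (min, max) pair.
        self.chunks = {(0, 100)}
        # Inject exactly one stepdown between step 1 and step 2.
        self.fail_after_apply = True

    def commit_chunk_split(self, lo, hi, split_point):
        # Assumed precondition: the original chunk must still exist.
        if (lo, hi) not in self.chunks:
            raise PreconditionFailed(f"chunk [{lo}, {hi}) not found")
        # Step 1: apply the batch of chunk updates.
        self.chunks.remove((lo, hi))
        self.chunks.update({(lo, split_point), (split_point, hi)})
        # Simulated stepdown after step 1 and before step 2.
        if self.fail_after_apply:
            self.fail_after_apply = False
            raise StepDown("primary stepped down")
        # Step 2: read back the state (stand-in for the version query).
        return sorted(self.chunks)


def split_with_retries(server, lo, hi, split_point, max_attempts=3):
    """Retry on stepdown only, up to max_attempts times."""
    for _ in range(max_attempts):
        try:
            return server.commit_chunk_split(lo, hi, split_point)
        except StepDown:
            continue
    raise StepDown("retries exhausted")
```

In this model, calling split_with_retries(ToyConfigServer(), 0, 100, 50) raises PreconditionFailed on the second attempt even though the chunk metadata already contains the two split chunks. An idempotent version would have to recognise on retry that the requested split is already in place and continue to step 2 instead of failing.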

