[SERVER-25834] Make rebalance chunk requests re-entrant Created: 26/Aug/16  Updated: 06/Dec/22  Resolved: 08/Sep/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Participants:

 Description   

Currently the 'rebalanceChunk' form of moveChunk uses the kNotIdempotent retry policy, while the 'moveChunk' form uses the kIdempotent policy. This makes the 'rebalanceChunk' form less resilient to things like config server primary failovers and network errors.

The only difference between the two forms is that one specifies a target shard and the other does not. We should allow 'rebalanceChunk' requests, which have no target shard attached, to be re-run after network errors. If a 'rebalanceChunk' operation is already in progress for the same chunk, the re-run request should just attach to it and return its result when it finishes.
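
To illustrate the "attach to the in-progress operation" idea, here is a minimal sketch in Python. It is not the server's implementation: the class InFlightRegistry, the method run_or_join, and the chunk key "chunk-1" are hypothetical names made up for this example. It only shows the general pattern of letting a retried request join an operation that is already running for the same chunk, instead of starting a second one.

```python
import threading
from concurrent.futures import Future


class InFlightRegistry:
    """Hypothetical helper: at most one running operation per key.

    The first caller for a key runs the operation; callers that arrive
    while it is still running attach to it and receive the same result
    (or the same exception).
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> Future for the running operation

    def run_or_join(self, key, operation):
        with self._lock:
            future = self._in_flight.get(key)
            owner = future is None
            if owner:
                # No operation is running for this key, so this caller owns it.
                future = Future()
                self._in_flight[key] = future

        if not owner:
            # An operation for this key is already in progress: attach to it
            # and block until it finishes.
            return future.result()

        try:
            result = operation()
        except BaseException as exc:
            # Propagate the failure to every attached caller as well.
            future.set_exception(exc)
            raise
        else:
            future.set_result(result)
            return result
        finally:
            # Forget the entry so a later request starts a fresh operation.
            with self._lock:
                del self._in_flight[key]
```

With something like this, a request retried after a network error would call run_or_join("chunk-1", do_rebalance), where do_rebalance is a placeholder for the actual work. If the original attempt is still running, the retry waits for it and returns its value rather than triggering a second migration of the same chunk.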



 Comments   
Comment by Andy Schwerin [ 08/Sep/16 ]

This behavior is only used by autosplit, which is comfortable with the existing failure modes.
