Type: Bug
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Sharding
Labels: None
Assigned Teams: Sharding NYC
Operating System: ALL
Story Points: 2
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

The abortReshardCollection command triggers a shard refresh through the _flushReshardingStateChange command. That command first acquires the database and collection locks to check whether the critical section is held; if it is not held, it acquires the same locks again as part of onShardVersionMismatch(). Both acquisitions can block if the shard has already enqueued a strong lock request. However, writes stalled behind that strong lock may be the very reason the user ran abortReshardCollection in the first place. Because abortReshardCollection ends up waiting for the strong lock request to be granted and then released, an end user would additionally need to run killOp on operations from internal (system) threads for the server to make forward progress, which undermines the utility of the abortReshardCollection command.
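The blocking behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyLock, its two modes, and the owner names are hypothetical and are not the server's lock manager. It assumes a FIFO queue in which a new request is queued whenever it conflicts with either a granted lock or a request already waiting.

```python
from collections import deque

# Toy conflict table (assumption): intent locks ("IX") are compatible with
# each other, while an exclusive lock ("X") conflicts with everything.
CONFLICTS = {("IX", "X"), ("X", "IX"), ("X", "X")}


class ToyLock:
    """Hypothetical FIFO lock, used only to illustrate the stall."""

    def __init__(self):
        self.granted = []       # list of (owner, mode) currently holding the lock
        self.pending = deque()  # FIFO queue of (owner, mode) waiting for the lock

    @staticmethod
    def _compatible(mode, requests):
        return all((mode, other) not in CONFLICTS for _, other in requests)

    def acquire(self, owner, mode):
        """Return True if granted immediately, False if the request is queued."""
        if self._compatible(mode, self.granted) and self._compatible(mode, self.pending):
            self.granted.append((owner, mode))
            return True
        self.pending.append((owner, mode))
        return False

    def release(self, owner):
        self.granted = [(o, m) for o, m in self.granted if o != owner]
        # Grant waiting requests in FIFO order for as long as they are compatible.
        while self.pending and self._compatible(self.pending[0][1], self.granted):
            self.granted.append(self.pending.popleft())


lock = ToyLock()
print(lock.acquire("user_write", "IX"))      # True: nothing else holds the lock
print(lock.acquire("internal_thread", "X"))  # False: queued behind the write
print(lock.acquire("flush", "IX"))           # False: queued behind the pending X
```

In this model the flush request is compatible with the lock that is currently granted, yet it still waits because a strong request is ahead of it in the queue. It is granted only after the strong request has been both granted and released.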

We should instead have an explicit {_shardsvrAbortReshardCollection: <reshardingUUID>} command that interacts with the DonorStateMachines and RecipientStateMachines directly. Note that the coordinator's decision is irreversible, so 'pushing' the decision out to the participant shards, as opposed to having them 'pull' it via a shard version refresh, is still safe in the presence of delayed messages.
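Why a pushed abort tolerates delayed messages can be sketched as follows. This is a minimal illustration, not the server's implementation: ToyParticipant, its state names, and its return values are hypothetical. It assumes that each abort message carries the resharding UUID and that a participant treats abort as a terminal state, so a late or repeated message cannot change the outcome.

```python
import uuid


class ToyParticipant:
    """Hypothetical stand-in for a donor or recipient state machine."""

    def __init__(self, resharding_uuid):
        self.resharding_uuid = resharding_uuid
        self.state = "in-progress"

    def on_abort(self, resharding_uuid):
        # A message for a different resharding operation (for example, a
        # delayed message from an earlier attempt) is ignored.
        if resharding_uuid != self.resharding_uuid:
            return "ignored"
        # The coordinator's decision is irreversible, so re-applying the
        # same abort is a no-op rather than an error.
        if self.state == "aborted":
            return "already-aborted"
        self.state = "aborted"
        return "aborted"


current_op = uuid.uuid4()
earlier_op = uuid.uuid4()
participant = ToyParticipant(current_op)

print(participant.on_abort(earlier_op))  # ignored
print(participant.on_abort(current_op))  # aborted
print(participant.on_abort(current_op))  # already-aborted
```

Keying the handler on the UUID is what lets the participant act on the message directly, without first taking locks to refresh its view of the collection.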

duplicates

SERVER-56638 Fix flushReshardingStateChanges critical section race (Closed)

is related to

SERVER-53258 [Resharding] Reject writes in opObserver when disallowWritesForResharding is true (Closed)

SERVER-54474 Introduce the _flushReshardingStateChange command (Closed)

Assignee: [DO NOT USE] Backlog - Sharding NYC
Reporter: Max Hirschhorn
Participants: [DO NOT USE] Backlog - Sharding NYC, Max Hirschhorn
Votes: 0
Watchers: 2

Created: Apr 07 2021 01:24:26 AM UTC
Updated: Dec 06 2022 01:26:51 AM UTC
Resolved: May 24 2021 01:22:21 PM UTC