- Wait for all recipients to enter the steady state. I believe that corresponds to the continuation that comes after this one.
Yes, once all recipients have reached state RecipientStateEnum::kSteadyState, the resharding operation becomes eligible to be committed. (There'd be no benefit to blocking writes on donor shards while the recipients are still doing their initial collection clone.)
To be slightly more precise, it corresponds to _reshardingCoordinatorObserver->awaitAllRecipientsFinishedApplying() becoming ready.
- Start a new observer service (this ticket), and keep it running so long as the resharding operation is not cancelled and the coordinator has not entered a critical section. This would postpone the execution of any continuation after this point.
- Once the maximum is less than the threshold (i.e., remainingReshardingOperationTimeMillisThreshold), notify the coordinator so that it enters the critical section. I'm not sure what state corresponds to the critical section for the coordinator and donors.
The action of the resharding coordinator that this ticket should postpone is specifically the transition to CoordinatorStateEnum::kMirroring (to be renamed to CoordinatorStateEnum::kBlockingWrites or similar as part of SERVER-54512). The goal of this ticket is to defer donor shards starting to block writes until it appears that the recipient shards are mostly caught up.
|
|
max.hirschhorn and lamont.nelson, here is my understanding of what this ticket should do, along with a few questions:
- Wait for all recipients to enter the steady state. I believe that corresponds to the continuation that comes after this one.
- Start a new observer service (this ticket), and keep it running so long as the resharding operation is not cancelled and the coordinator has not entered a critical section. This would postpone the execution of any continuation after this point.
- Gather the currentOp output for each recipient, and calculate the maximum of the collected remainingOperationTimeEstimatedMillis for the recipients.
- Once the maximum is less than the threshold (i.e., remainingReshardingOperationTimeMillisThreshold), notify the coordinator so that it enters the critical section. I'm not sure what state corresponds to the critical section for the coordinator and donors.
- Once the coordinator enters the critical section and persists the state change, donors will get notified through the coordinator document.
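To make the threshold check in the steps above concrete, here is a rough, standalone sketch of the decision the observer would make on each polling round. It is not the server's actual code: `RemainingEstimate`, `maxRemainingTime`, and `shouldEnterCriticalSection` are hypothetical names, the estimates are assumed to have already been pulled out of each recipient's currentOp output, and treating a recipient with no estimate yet as "not caught up" is my assumption rather than something the ticket specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <optional>
#include <vector>

using Milliseconds = std::chrono::milliseconds;

// Hypothetical stand-in for one recipient's remainingOperationTimeEstimatedMillis
// as read from currentOp. std::nullopt means the recipient reported no estimate yet.
using RemainingEstimate = std::optional<Milliseconds>;

// Returns the largest remaining-time estimate across all recipients.
// Returns std::nullopt if there are no recipients or if any recipient has not
// reported an estimate (assumption: an unknown estimate means "cannot decide yet").
std::optional<Milliseconds> maxRemainingTime(const std::vector<RemainingEstimate>& estimates) {
    if (estimates.empty()) {
        return std::nullopt;
    }
    Milliseconds maxSoFar{0};
    for (const auto& estimate : estimates) {
        if (!estimate) {
            return std::nullopt;
        }
        maxSoFar = std::max(maxSoFar, *estimate);
    }
    return maxSoFar;
}

// True once the slowest recipient's estimate is strictly below the threshold
// (the role played by remainingReshardingOperationTimeMillisThreshold).
bool shouldEnterCriticalSection(const std::vector<RemainingEstimate>& estimates,
                                Milliseconds threshold) {
    assert(threshold.count() >= 0);
    const auto maxRemaining = maxRemainingTime(estimates);
    return maxRemaining && *maxRemaining < threshold;
}
```

In this sketch the observer would call `shouldEnterCriticalSection` after each round of currentOp polling and notify the coordinator the first time it returns true; cancellation and the polling loop itself are left out.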
|