- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Cluster Scalability
- Fully Compatible
- ALL
- ClusterScalability 27Apr-11May
- 200
The resharding coordinator is written largely as one long future chain whose continuations are scheduled sequentially. Although individual continuations may run on different executor threads, they never run concurrently, because each continuation is scheduled only after the previous one completes. This lets the main future chain read the coordinator's state without holding locks: that same chain is the only writer, so there is no race.
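The sequential-scheduling property can be illustrated with a small Python sketch. This is not the coordinator's code; `run_chain`, `make_step`, `demo_sequential_chain`, and the step names are hypothetical and exist only to show why an unlocked shared state is safe when each step is submitted after the previous one finishes.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor


def run_chain(executor, steps, state):
    """Run steps in order; step N+1 is submitted only after step N completes."""
    done = Future()

    def schedule(index):
        if index == len(steps):
            done.set_result(state)
            return
        step_future = executor.submit(steps[index], state)
        step_future.add_done_callback(lambda f: advance(f, index))

    def advance(finished, index):
        error = finished.exception()
        if error is not None:
            done.set_exception(error)  # a failed step ends the chain
        else:
            schedule(index + 1)  # only now does the next step get scheduled

    schedule(0)
    return done


def make_step(name):
    """Build a step that mutates shared state without taking any lock."""
    def step(state):
        state["active"] += 1
        state["max_active"] = max(state["max_active"], state["active"])
        state["log"].append(name)
        time.sleep(0.01)  # simulate some work
        state["active"] -= 1
    return step


def demo_sequential_chain():
    state = {"active": 0, "max_active": 0, "log": []}
    steps = [make_step(n) for n in ("load", "persist", "notify", "finish")]
    # Four worker threads are available, yet the steps never overlap.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return run_chain(pool, steps, state).result(timeout=5)


if __name__ == "__main__":
    print(demo_sequential_chain())
```

Steps may land on different pool threads, but at most one is ever active, which is the only reason the unlocked reads and writes in `make_step` are safe.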
However, this property is violated when _onAbortCoordinatorAndParticipants calls _awaitAllParticipantShardsDone, because the resulting future is awaited through the WithCancellation helper. When WithCancellation returns a CallbackCanceled error, the main future chain can move past this wait even though the underlying future has not completed. Because this happens only when the stepdown token is cancelled, the main executor will likely refuse further work, so the chain actually continues on the cleanup executor.
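A minimal Python sketch of this early-return behaviour follows. It assumes a wrapper that stops waiting when a cancellation token fires but does not stop the wrapped work; `with_cancellation`, `WaitCancelled`, `demo_early_return`, and the `doc_removed` flag are hypothetical stand-ins, not the server's actual helper or state.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class WaitCancelled(Exception):
    """Hypothetical stand-in for the error reported when the token fires."""


def with_cancellation(inner, token):
    """Resolve with `inner`, or fail once `token` is set, whichever is first.

    Setting the token only abandons the wait; `inner` keeps running.
    """
    outer = Future()
    lock = threading.Lock()

    def resolve_once(action):
        with lock:
            if not outer.done():
                action()

    def on_inner_done(f):
        error = f.exception()
        if error is not None:
            resolve_once(lambda: outer.set_exception(error))
        else:
            resolve_once(lambda: outer.set_result(f.result()))

    def watch_token():
        token.wait()
        resolve_once(lambda: outer.set_exception(WaitCancelled()))

    inner.add_done_callback(on_inner_done)
    threading.Thread(target=watch_token, daemon=True).start()
    return outer


def demo_early_return():
    token = threading.Event()    # plays the role of the stepdown token
    started = threading.Event()
    release = threading.Event()  # controls when the wrapped work finishes
    state = {"doc_removed": False}

    def await_participants():
        started.set()
        release.wait()               # busy; no interrupt check reached yet
        state["doc_removed"] = True  # late write to the shared state

    with ThreadPoolExecutor(max_workers=2) as pool:
        inner = pool.submit(await_participants)
        outer = with_cancellation(inner, token)
        started.wait()
        token.set()
        try:
            outer.result(timeout=5)
            cancelled = False
        except WaitCancelled:
            cancelled = True
        # The chain has moved on, but the wrapped work has not finished.
        still_running = not inner.done()
        seen_by_cleanup = state["doc_removed"]
        release.set()
        inner.result(timeout=5)
    return cancelled, still_running, seen_by_cleanup, state["doc_removed"]


if __name__ == "__main__":
    print(demo_early_return())
```

In this sketch the caller observes the cancellation and continues while the wrapped work is still in flight, so the value it reads from `state` can change underneath it afterwards.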
Meanwhile, the original _awaitAllParticipantShardsDone may still be doing work on a separate executor thread, unaware that this node is stepping down because it has not yet reached a point where interrupts are checked. As a result, that work may remove the state document and update the in-memory state concurrently with the main chain's cleanup logic, which reads the same state to perform final logging.
This leads to the issue seen in BF-43147.