- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Cluster Scalability
- Fully Compatible
- ALL
- ClusterScalability Jun23-Jul6
On a resharding donor, acquiring the critical section involves taking the exclusive S lock on the collection, which cannot be done while there are in-progress transactions involving that collection. That is, the lock acquisition blocks until those transactions commit or abort. Depending on how long the wait is, a donor might be unable to acquire the critical section and start blocking writes within the critical section timeout, or might acquire the critical section so late that the recipients cannot finish fetching and applying oplog entries within the critical section.
As a workaround, each resharding donor should abort in-progress unprepared transactions upon entering the "preparing-to-block-writes" state, similar to what the setFCV command does. This is best-effort only: prepared transactions cannot be aborted because, by design, a participant in a cross-shard transaction cannot unilaterally decide to abort (or commit) the transaction.
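The proposed behavior can be illustrated with a small toy model. This is only a sketch, not server code: `Txn`, `TxnState`, `abort_unprepared_transactions`, `lock_blockers`, and `enter_preparing_to_block_writes` are hypothetical names made up for illustration, and the model assumes that any transaction on the collection that has not been aborted still blocks the collection lock.

```python
from dataclasses import dataclass
from enum import Enum


class TxnState(Enum):
    IN_PROGRESS = "in_progress"  # unprepared, still running
    PREPARED = "prepared"        # waiting for the coordinator's decision
    ABORTED = "aborted"


@dataclass
class Txn:
    txn_id: int
    namespace: str
    state: TxnState


def abort_unprepared_transactions(txns):
    """Best-effort: abort every unprepared transaction, skip prepared ones.

    Returns the ids of the transactions that were aborted.
    """
    aborted = []
    for txn in txns:
        if txn.state is TxnState.IN_PROGRESS:
            txn.state = TxnState.ABORTED
            aborted.append(txn.txn_id)
    return aborted


def lock_blockers(txns, namespace):
    """Ids of transactions that would still block the collection lock."""
    return [
        t.txn_id
        for t in txns
        if t.namespace == namespace and t.state is not TxnState.ABORTED
    ]


def enter_preparing_to_block_writes(txns, namespace):
    """Hypothetical donor step: abort first, then report remaining blockers."""
    abort_unprepared_transactions(txns)
    return lock_blockers(txns, namespace)
```

In this model, a collection with one unprepared and one prepared transaction is left with only the prepared transaction as a blocker after the donor step. That is the best-effort caveat: the donor still has to wait for the coordinator to resolve the prepared transaction.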
is related to
- SERVER-106989 Make the Locker class support lockCollectionBegin() and lockCollectionComplete() (Backlog)

related to
- SERVER-107226 Make InterruptedDueToReshardingCriticalSection a retriable error (Closed)
- SERVER-106990 Make ShardingRecoveryService::acquireRecoverableCriticalSectionBlockWrite support aborting unprepared transactions in between enqueuing lock request for collection X lock and acquiring the lock (Blocked)
- SERVER-107235 Expose a way to only abort unprepared transactions that touched a specific namespace (Backlog)