- Type: Bug
- Resolution: Works as Designed
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Catalog and Routing
- ALL
- CAR Team 2024-08-05, CAR Team 2024-08-19
- 0
Consider the following sequence:
- The primary node of a shard (nodeA) is coordinating a DDL operation (shardDDL1). It checks out a session, SessionX, from the InternalSessionPool and increments its txnNumber up to 10.
- nodeA executes the last phase of shardDDL1 but steps down during the "release coordinator" step. In that case, SessionX may be returned to the node's InternalSessionPool (PoolA) without the recovery doc being deleted, so the doc still contains a reference to SessionX with txnNumber = 10.
- nodeB steps up and
  - resumes shardDDL1, completing it and returning SessionX to its own PoolB (for example, with txnNumber = 12);
  - later starts executing shardDDL2, checking out SessionX from PoolB and advancing its txnNumber up to 15;
  - steps down, leaving a recovery document recording that SessionX is checked out at txnNumber 15.
- nodeA steps up again and
  - starts serving shardDDL3, checking out SessionX from PoolA (where it is still at txnNumber 10)...
  - ... while also resuming shardDDL2, which has SessionX checked out at txnNumber 15 (per the recovery doc metadata).
In this scenario, we expect shardDDL3 to hit TransactionTooOld errors.
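The sequence can be replayed with a small toy model. This is only an illustrative sketch, not the server's implementation: `Participant`, `advance`, `simulate`, and the dict-based pools and recovery docs are hypothetical stand-ins, and the only rule modelled is that a txnNumber lower than the highest one already seen on a session is rejected.

```python
class TransactionTooOld(Exception):
    """Raised by the toy participant when a txnNumber is older than one already seen."""


class Participant:
    """Hypothetical stand-in that tracks the highest txnNumber seen per session."""

    def __init__(self):
        self.last_seen = {}

    def run(self, session_id, txn_number):
        last = self.last_seen.get(session_id, -1)
        if txn_number < last:
            raise TransactionTooOld(f"{session_id}: txnNumber {txn_number} < {last}")
        self.last_seen[session_id] = txn_number


def advance(participant, session_id, start, end):
    """Run txnNumbers start+1..end on the session and return the final txnNumber."""
    for n in range(start + 1, end + 1):
        participant.run(session_id, n)
    return end


def simulate():
    participant = Participant()
    pool_a, pool_b = {}, {}   # per-node in-memory pools (assumed: session id -> txnNumber)
    recovery_docs = {}        # assumed shape: operation -> (session id, txnNumber)

    # nodeA is primary: shardDDL1 drives SessionX up to txnNumber 10.
    recovery_docs["shardDDL1"] = ("SessionX", advance(participant, "SessionX", 0, 10))
    # Step-down during "release coordinator": SessionX goes back to PoolA,
    # but the recovery doc is left behind.
    pool_a["SessionX"] = 10

    # nodeB steps up, resumes shardDDL1 from the recovery doc and completes it at 12.
    sid, txn = recovery_docs.pop("shardDDL1")
    pool_b[sid] = advance(participant, sid, txn, 12)
    # nodeB runs shardDDL2 on the same session up to 15, then steps down.
    txn = pool_b.pop("SessionX")
    recovery_docs["shardDDL2"] = ("SessionX", advance(participant, "SessionX", txn, 15))

    # nodeA steps up again: shardDDL3 checks out the stale SessionX (10) from PoolA.
    stale = pool_a.pop("SessionX")
    try:
        advance(participant, "SessionX", stale, stale + 1)
    except TransactionTooOld as exc:
        return str(exc)
    return None
```

In this model, `simulate()` returns the error raised for shardDDL3, because PoolA on nodeA never learned that SessionX had advanced past 10 while nodeB was primary.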
- related to: SERVER-92687 MultiUpdateCoordinator can release a Session into the InternalSessionPool more than once which can lead to TransactionTooOld (Open)