[SERVER-40223] Inserting a completed `TransactionCoordinator` into the catalog leads to invariant
Created: 19/Mar/19  Updated: 29/Oct/23  Resolved: 21/Mar/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.10

Type: Bug
Priority: Major - P3
Reporter: Kaloian Manassiev
Assignee: Kaloian Manassiev
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2019-03-25
Participants:

 Description   

Upon inserting a TransactionCoordinator, the catalog registers a continuation on the coordinator to remove it from the catalog when it completes. If the coordinator's deadline is too short, the inserted coordinator may already have completed. In that case the continuation runs immediately and invokes the _remove method while the lock taken by insert is still held, which triggers an invariant.
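The hazard can be illustrated with a minimal, self-contained sketch. It is not the server code: `SketchCoordinator`, `SketchCatalog`, `onCompletion` and the integer ids are hypothetical stand-ins, and registering the continuation only after the catalog mutex is released is just one possible way to avoid the problem, not necessarily what the commits below do.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a coordinator with a completion continuation.
class SketchCoordinator {
public:
    // If the coordinator is already complete, cb runs inline on the caller's
    // thread. Otherwise it is stored and run later by complete().
    void onCompletion(std::function<void()> cb) {
        std::unique_lock<std::mutex> lk(_mutex);
        if (_done) {
            lk.unlock();
            cb();
            return;
        }
        _callbacks.push_back(std::move(cb));
    }

    // Marks the coordinator complete and runs the stored continuations.
    void complete() {
        std::vector<std::function<void()>> callbacks;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (_done)
                return;
            _done = true;
            callbacks.swap(_callbacks);
        }
        for (auto& cb : callbacks)
            cb();
    }

private:
    std::mutex _mutex;
    bool _done = false;
    std::vector<std::function<void()>> _callbacks;
};

// Hypothetical stand-in for the catalog. It must outlive every coordinator
// inserted into it, because the continuation captures `this`.
class SketchCatalog {
public:
    void insert(int id, std::shared_ptr<SketchCoordinator> coordinator) {
        assert(coordinator);
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _coordinators.emplace(id, coordinator);
        }
        // The continuation is registered only after _mutex is released. If the
        // coordinator has already completed, remove() runs inline right here
        // and can take _mutex without re-entering it. Registering inside the
        // locked scope above would re-lock a non-recursive mutex.
        coordinator->onCompletion([this, id] { remove(id); });
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _coordinators.size();
    }

private:
    void remove(int id) {
        std::lock_guard<std::mutex> lk(_mutex);
        _coordinators.erase(id);
    }

    std::mutex _mutex;
    std::map<int, std::shared_ptr<SketchCoordinator>> _coordinators;
};
```

In this sketch, inserting a coordinator on which `complete()` has already been called leaves the catalog empty again as soon as `insert` returns, while a pending coordinator stays in the catalog until it completes.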

While investigating this issue, I also realized that there is another race condition: if the TransactionCoordinatorService's scheduler is shut down and the coordinator catalog is stepped down just after a new TransactionCoordinator is constructed, the coordinator will never complete, because its deadline task will not run.
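A rough sketch of the property the fix is after, namely that a coordinator always reaches a terminal state even when its scheduler is shut down, could look like the following. All names (`SketchScheduler`, `DeadlineCoordinator`, `Outcome`, `runPending`) are hypothetical and chosen for illustration; the real AsyncWorkScheduler API is not reproduced here. The assumption is a scheduler that never silently drops a task: it either runs it, or invokes it with a "cancelled" flag on shutdown.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical terminal states for the sketch.
enum class Outcome { kPending, kDeadlineExpired, kSchedulerShutDown };

// Hypothetical scheduler that guarantees every accepted or rejected task is
// invoked exactly once: with `true` when it runs normally, with `false` when
// the scheduler is (or gets) shut down.
class SketchScheduler {
public:
    using Task = std::function<void(bool)>;

    void schedule(Task task) {
        assert(task != nullptr);
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (!_shutDown) {
                _pending.push_back(std::move(task));
                return;
            }
        }
        // Already shut down: report cancellation instead of dropping the task.
        task(false);
    }

    // Simulates the deadline firing for every pending task.
    void runPending() {
        std::vector<Task> tasks;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            tasks.swap(_pending);
        }
        for (auto& task : tasks)
            task(true);
    }

    // Rejects future tasks and cancels the ones still pending.
    void shutdown() {
        std::vector<Task> tasks;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _shutDown = true;
            tasks.swap(_pending);
        }
        for (auto& task : tasks)
            task(false);
    }

private:
    std::mutex _mutex;
    bool _shutDown = false;
    std::vector<Task> _pending;
};

// Hypothetical coordinator that schedules its deadline task on construction.
// It must outlive the scheduled task, because the task captures `this`.
class DeadlineCoordinator {
public:
    explicit DeadlineCoordinator(SketchScheduler& scheduler) {
        scheduler.schedule([this](bool ran) {
            _outcome.store(ran ? Outcome::kDeadlineExpired
                               : Outcome::kSchedulerShutDown);
        });
    }

    Outcome outcome() const {
        return _outcome.load();
    }

private:
    std::atomic<Outcome> _outcome{Outcome::kPending};
};
```

With this shape, a coordinator constructed just before or just after `shutdown()` still ends up in `kSchedulerShutDown` rather than staying pending forever, which is the behavior the race described above was missing.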



 Comments   
Comment by Githook User [ 22/Mar/19 ]

Author:

{'name': 'Kaloian Manassiev', 'username': 'kaloianm', 'email': 'kaloian.manassiev@mongodb.com'}

Message: SERVER-40223 Use the AsyncWorkScheduler to run local command when recovering a coordinator decision
Branch: master
https://github.com/mongodb/mongo/commit/be36aeb7166b2e06dd47dd0769ea28cbb7250041

Comment by Githook User [ 21/Mar/19 ]

Author:

{'name': 'Kaloian Manassiev', 'username': 'kaloianm', 'email': 'kaloian.manassiev@mongodb.com'}

Message: Revert "SERVER-40223 Use the AsyncWorkScheduler to run local command when recovering a coordinator decision"

This reverts commit cc2b4b907aaf788f356ec23e1b315ea5d7b2cf82.
Branch: master
https://github.com/mongodb/mongo/commit/ef0c1da8c82cf3f901dbbe17c35bfb7a6fdd8d51

Comment by Githook User [ 21/Mar/19 ]

Author:

{'email': 'kaloian.manassiev@mongodb.com', 'name': 'Kaloian Manassiev', 'username': 'kaloianm'}

Message: SERVER-40223 Ensure that the TransactionCoordinator will always complete if its scheduler is shut down
Branch: master
https://github.com/mongodb/mongo/commit/c3d1212f1a4a258833b792a43d2d2d8a144f1adb

Comment by Githook User [ 21/Mar/19 ]

Author:

{'email': 'kaloian.manassiev@mongodb.com', 'name': 'Kaloian Manassiev', 'username': 'kaloianm'}

Message: SERVER-40223 Use the AsyncWorkScheduler to run local command when recovering a coordinator decision

... instead of the TaskExecutor
Branch: master
https://github.com/mongodb/mongo/commit/cc2b4b907aaf788f356ec23e1b315ea5d7b2cf82

Generated at Thu Feb 08 04:54:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.