[SERVER-59965] Distributed deadlock between renameCollection and multi-shard transaction Created: 15/Sep/21  Updated: 29/Oct/23  Resolved: 25/Oct/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.2.0, 5.0.4, 5.1.0-rc3

Type: Bug
Priority: Major - P3
Reporter: Jordi Serra Torrens
Assignee: Jordi Serra Torrens
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File 0001-SERVER-59965-repro.patch    
Issue Links:
Backports
Depends
is depended on by SERVER-58991 Acquire the critical section on the r... Closed
Documented
is documented by DOCS-14892 Investigate changes in SERVER-59965: ... Closed
Related
is related to DOCS-14907 [BACKPORT] [v5.0] Distributed deadloc... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.1, v5.0
Steps To Reproduce:

0001-SERVER-59965-repro.patch

Sprint: Sharding EMEA 2021-09-20, Sharding EMEA 2021-10-04, Sharding EMEA 2021-10-18, Sharding EMEA 2021-11-01
Participants:

 Description   

As part of a sharded renameCollection, the DDLCoordinator instructs all participant shards to enter their critical sections. Once all shards have entered, the coordinator does some work on the configsvr and finally tells the shards to leave their critical sections.
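
The three coordinator phases described above can be sketched as follows. This is only an illustrative model, not the server's code: the `Shard` class, `run_rename`, and the `configsvr_work` callback are hypothetical names, and entering the critical section is modeled as an instant flag flip, whereas on a real shard that step can block.

```python
from typing import Callable, List


class Shard:
    """Hypothetical stand-in for a participant shard."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.in_critical_section = False

    def enter_critical_section(self) -> None:
        # In the real system this step may block; here it is instantaneous.
        self.in_critical_section = True

    def leave_critical_section(self) -> None:
        self.in_critical_section = False


def run_rename(shards: List[Shard], configsvr_work: Callable[[], None]) -> None:
    """Model of the coordinator flow: enter everywhere, work, leave everywhere."""
    # Phase 1: every participant enters its critical section.
    for shard in shards:
        shard.enter_critical_section()
    # Phase 2: runs only after ALL shards have entered.
    configsvr_work()
    # Phase 3: participants are told to leave their critical sections.
    for shard in shards:
        shard.leave_critical_section()
```

The point of the sketch is the ordering constraint: no shard leaves its critical section until every shard has entered and the configsvr work has finished.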

When running renameCollection concurrently with multi-shard transactions that affect that same collection, there exists a particular interleaving that can lead to a distributed deadlock:
1. shard0 receives the RenameCollectionParticipant command and enters its critical section
2. shard0 attempts to run a statement of the multi-shard txn. Since the critical section is taken, the statement throws StaleConfig. The error is caught on the way out of the command, which then attempts to refresh the shardVersion. However, since the critical section is taken, the refresh blocks until the critical section is released.
3. shard1 runs its part of that multi-shard transaction, which acquires the collection lock in MODE_IX and then stashes the locks.
4. shard1 receives the RenameCollectionParticipant command and attempts to enter the critical section. However, since the transaction from step 3 has stashed the collection lock, shard1 cannot acquire the collection lock in MODE_S that it needs to enter the critical section.

At this point we are deadlocked:

  • shard0 is holding its critical section and won't release it until shard1 has entered its own.
  • shard1 is holding the collection lock in MODE_IX until the txn gets committed, which won't happen because the txn (or perhaps, rather, the refresh) is not making progress on shard0 due to the critical section.
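
One way to see the cycle is to write the waits described above as a wait-for graph and search it for a loop. The sketch below is purely illustrative: the node names are made-up labels for the actors in this ticket, and the depth-first search is a generic cycle finder, not anything the server runs.

```python
from typing import Dict, List, Optional

# Hypothetical labels; each entry reads "key waits for each value".
WAITS_FOR: Dict[str, List[str]] = {
    # The txn statement's refresh on shard0 blocks behind the critical section.
    "txn_refresh_on_shard0": ["shard0_critical_section"],
    # shard0's critical section is released only after shard1 has entered its own.
    "shard0_critical_section": ["rename_participant_on_shard1"],
    # The participant on shard1 needs MODE_S, blocked by the stashed MODE_IX lock.
    "rename_participant_on_shard1": ["txn_collection_lock_on_shard1"],
    # The stashed lock is held until the txn commits, which needs shard0 to progress.
    "txn_collection_lock_on_shard1": ["txn_refresh_on_shard0"],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first == last), or None if acyclic."""
    path: List[str] = []  # nodes on the current DFS path
    done = set()          # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in done:
            return None
        if node in path:
            return path[path.index(node):] + [node]
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle is not None:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle is not None:
            return cycle
    return None


if __name__ == "__main__":
    print(" -> ".join(find_cycle(WAITS_FOR) or ["no cycle"]))
```

Removing any single edge from the graph breaks the loop, which is why making one of the waiters give up is enough to resolve the deadlock.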

More generally, I believe this situation can occur in any DDL operation that needs to hold the critical section on several nodes at the same time. I believe that resharding may also be affected.



 Comments   
Comment by Githook User [ 28/Oct/21 ]

Author: Jordi Serra Torrens (jordi.serra-torrens@mongodb.com, username: jordist)

Message: SERVER-59965 Limit max time wait behind critical section during filtering metadata refresh in txn

(cherry picked from commit 02add56a2100bef135281938a0cadaf374279f03)
Branch: v5.0
https://github.com/mongodb/mongo/commit/2fe5ed35b58f3f879cdf4200133102a9ae18d9ca

Comment by Githook User [ 28/Oct/21 ]

Author: Jordi Serra Torrens (jordi.serra-torrens@mongodb.com, username: jordist)

Message: SERVER-59965 Limit max time wait behind critical section during filtering metadata refresh in txn

(cherry picked from commit 02add56a2100bef135281938a0cadaf374279f03)
Branch: v5.1
https://github.com/mongodb/mongo/commit/fe4cbeb6d0fa079e80b1a300cd4ec8a56cffdd77

Comment by Githook User [ 25/Oct/21 ]

Author: Jordi Serra Torrens (jordi.serra-torrens@mongodb.com, username: jordist)

Message: SERVER-59965 Limit max time wait behind critical section during filtering metadata refresh in txn
Branch: master
https://github.com/mongodb/mongo/commit/02add56a2100bef135281938a0cadaf374279f03

Comment by Jordi Serra Torrens [ 16/Sep/21 ]

The proposal is to resolve the deadlock by skipping this refresh (which blocks behind the critical section) when we are in a transaction and the critical section is taken. The StaleConfig error will then be propagated to the client with a TransientTransactionError label, so the transaction will be retried.
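
A minimal sketch of the decision described in this proposal, with assumptions stated up front: `StaleConfigError`, `handle_stale_config`, and its parameters are hypothetical names invented for illustration, and the sketch models only the skip-the-refresh idea from this comment. The commit message on this ticket speaks of limiting the maximum wait behind the critical section, so the merged change may well differ from what is shown here.

```python
from typing import Callable, List, Optional


class StaleConfigError(Exception):
    """Hypothetical stand-in for the StaleConfig error and its labels."""

    def __init__(self, message: str, error_labels: Optional[List[str]] = None) -> None:
        super().__init__(message)
        self.error_labels: List[str] = list(error_labels or [])


def handle_stale_config(
    err: StaleConfigError,
    in_transaction: bool,
    critical_section_taken: bool,
    refresh: Callable[[], None],
) -> StaleConfigError:
    """Return the error to propagate, refreshing first only when it is safe."""
    if in_transaction and critical_section_taken:
        # Skip the refresh: it would block behind the critical section while
        # the transaction keeps its locks stashed on other shards.
        if "TransientTransactionError" not in err.error_labels:
            err.error_labels.append("TransientTransactionError")
        return err
    # Otherwise refresh as before (this call may block until the critical
    # section is released); what happens afterwards is outside this sketch.
    refresh()
    return err
```

With the label attached, the client is expected to retry the whole transaction, which releases the stashed locks and lets the participant on the other shard enter its critical section.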

Generated at Thu Feb 08 05:48:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.