[SERVER-48531] 3 way deadlock can happen between chunk splitter, prepared transactions and stepdown thread. Created: 01/Jun/20  Updated: 29/Oct/23  Resolved: 30/Jun/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.4.1, 4.7.0

Type: Bug
Priority: Major - P3
Reporter: Suganthi Mani
Assignee: Kshitij Gupta
Resolution: Fixed
Votes: 0
Labels: bkp, sharding-interns-2020, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4, v4.2
Sprint: Sharding 2020-06-15, Sharding 2020-06-29

 Description   

Currently, step down kills all conflicting user operations, as well as internal operations that are marked killable using setSystemOperationKillable. The operations it kills are:

  • Write operations that take the global lock in IX or X mode.
  • Read operations that take the global lock in S mode.
  • Operations (read/write) that are blocked on a prepare conflict.

Step down hangs because of the following three-way deadlock:

  1. The chunk splitter thread (_runAutosplit) performs a read while holding the RSTL (replication state transition lock) in IX mode and is blocked by a prepared transaction due to a prepare conflict. ChunkSplitter internal threads are not marked killable, so step down cannot kill or interrupt these internal read operations.
  2. Step down enqueues a request for the RSTL in X mode and is blocked behind the chunk splitter internal thread.
  3. The commitTransaction command is waiting to acquire the RSTL in IX mode but is blocked behind the step down thread, so the prepared transaction that blocks the chunk splitter cannot be committed.
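
The three waits above form a cycle. The following is a minimal sketch of that cycle as a wait-for graph; it is only a model of who waits on whom, not server code. The party labels and the find_cycle helper are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical labels for the three parties in this ticket. Each party maps
# to the party it is waiting on (one "wait-for" edge per party).
WAITS_FOR: Dict[str, str] = {
    # Blocked on a prepare conflict until the prepared transaction resolves.
    "chunk_splitter": "commit_transaction",
    # Wants the RSTL in X mode; the chunk splitter holds it in IX mode.
    "step_down": "chunk_splitter",
    # Wants the RSTL in IX mode; queued behind step down's X request.
    "commit_transaction": "step_down",
}


def find_cycle(waits_for: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from start; return the cycle if one is reached."""
    path: List[str] = []
    node: Optional[str] = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # reached a party that is not waiting: no deadlock
    return path[path.index(node):]


cycle = find_cycle(WAITS_FOR, "step_down")
print(cycle)
```

Removing any one edge from WAITS_FOR makes find_cycle return None, which is why interrupting or unblocking a single party is enough to let step down finish.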


 Comments   
Comment by Githook User [ 05/Aug/20 ]

Author: Kshitij Gupta (kshitij.gupta@mongodb.com, username kshitijng)

Message: SERVER-48531: 3 way deadlock can happen between chunk splitter, prepared transactions and stepdown thread.

(cherry picked from commit 1e23a0f7659d67df27ae8d553f99f35e52a91c0c)
Branch: v4.4
https://github.com/mongodb/mongo/commit/a85277d4c53722830121cb3f447effa6cc70fd3d

Comment by Githook User [ 30/Jun/20 ]

Author: Kshitij Gupta (kshitij.gupta@mongodb.com, username kshitijng)

Message: SERVER-48531: 3 way deadlock can happen between chunk splitter, prepared transactions and stepdown thread.
Branch: master
https://github.com/mongodb/mongo/commit/1e23a0f7659d67df27ae8d553f99f35e52a91c0c

Comment by Suganthi Mani [ 01/Jun/20 ]

Two possible solutions currently exist for this problem:
1) Make _runAutosplit() set its prepare conflict behavior to PrepareConflictBehavior::kIgnoreConflictsAllowWrites (something like this).
2) Mark the operations killable using setSystemOperationKillable (something like this).
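
As a rough illustration of why either option would unblock step down, the sketch below models the waits as a dependency map and checks whether the parties can become unblocked in some order. It is a toy model, not server code: the party labels, the resolution_order helper, and the assumed effect of each option on the waits are all illustrative assumptions.

```python
from typing import Dict, List, Set


def resolution_order(waits_for: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which parties become unblocked, or [] on deadlock.

    A party is unblocked once every party it waits on is already unblocked.
    """
    remaining = {party: set(deps) for party, deps in waits_for.items()}
    order: List[str] = []
    while remaining:
        ready = sorted(p for p, deps in remaining.items() if not deps)
        if not ready:
            return []  # everyone left is waiting on someone else: deadlock
        for party in ready:
            order.append(party)
            del remaining[party]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


# Waits as described in the ticket (labels are hypothetical).
deadlocked: Dict[str, Set[str]] = {
    "chunk_splitter": {"commit_transaction"},  # prepare conflict
    "step_down": {"chunk_splitter"},           # RSTL X behind an IX holder
    "commit_transaction": {"step_down"},       # RSTL IX behind the X request
}

# Option 1 (assumed effect): the chunk splitter's read ignores prepare
# conflicts, so it no longer waits on the prepared transaction.
option_1 = {**deadlocked, "chunk_splitter": set()}

# Option 2 (assumed effect): step down kills the chunk splitter's operation,
# which drops out of the picture and releases the RSTL.
option_2 = {"step_down": set(), "commit_transaction": {"step_down"}}

print(resolution_order(deadlocked))
print(resolution_order(option_1))
print(resolution_order(option_2))
```

The model only tracks blocking; it says nothing about whether commitTransaction succeeds on a node that has just stepped down.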

Note: The fix has to be backported as far back as 4.2.
CC matthew.saltz esha.maharishi
