  1. Core Server
  2. SERVER-41355

Step down should call yieldLocksForPreparedTransactions w/o holding repl mutex lock (ReplicationCoordinatorImpl::_mutex).

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.2.0-rc3, 4.3.1
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Fully Compatible
    • v4.2
    • Repl 2019-06-03, Repl 2019-06-17, Repl 2019-07-01, Repl 2019-07-15, Repl 2019-07-29
    • 10

      Currently, step down calls yieldLocksForPreparedTransactions while holding both the RSTL and the repl mutex (ReplicationCoordinatorImpl::_mutex). As a result, it can deadlock with prepared transaction threads that have checked out the session. Consider the following case:

      1) Thread A (a transaction command) has checked out the session.
      2) Step down acquires the RSTL and the repl mutex.
      3) Step down calls yieldLocksForPreparedTransactions, which marks thread A as killed because it has checked out the session and its transaction state is TransactionState::kPrepared.
      4) Thread A tries to acquire the repl mutex, which is held by the step down thread.
      5) Step down waits for thread A to check in the session so that it can check out the session itself and yield the locks of that prepared transaction. But thread A can't check in the session because it is waiting for the repl mutex, and that wait is not interruptible.
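
      The lock ordering proposed in the title can be sketched in standalone C++ as below. This is only an illustration of the idea, not the server's code: Session, replMutex, and runStepDownSketch are hypothetical stand-ins, and the RSTL is left out. The step down side releases the mutex before it kills the session and waits for check-in, so thread A's mutex acquisition in step 4 can succeed.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a checked-out session; not a MongoDB type.
struct Session {
    std::mutex m;
    std::condition_variable cv;
    bool checkedOut = true;  // step 1: thread A already holds the session
    bool killed = false;

    void kill() {
        { std::lock_guard<std::mutex> lk(m); killed = true; }
        cv.notify_all();
    }
    void waitUntilKilled() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return killed; });
    }
    void checkIn() {
        { std::lock_guard<std::mutex> lk(m); checkedOut = false; }
        cv.notify_all();
    }
    void waitForCheckIn() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !checkedOut; });
    }
    bool isCheckedOut() {
        std::lock_guard<std::mutex> lk(m);
        return checkedOut;
    }
};

// Stand-in for ReplicationCoordinatorImpl::_mutex (assumption: a plain mutex).
std::mutex replMutex;

// Runs the scenario with the mutex released before the yield step.
// Returns true once step down has observed the session check-in.
bool runStepDownSketch() {
    Session session;

    // Thread A: once killed, it still needs the repl mutex (step 4)
    // before it can check the session back in.
    std::thread txnThread([&session] {
        session.waitUntilKilled();
        { std::lock_guard<std::mutex> lk(replMutex); }  // uninterruptible wait
        session.checkIn();
    });

    // Step down thread.
    std::unique_lock<std::mutex> lk(replMutex);  // step 2
    // ... work that genuinely needs the repl mutex would go here ...
    lk.unlock();               // release before yielding prepared transactions
    session.kill();            // step 3: mark thread A as killed
    session.waitForCheckIn();  // step 5: no longer blocks thread A

    txnThread.join();
    assert(!session.isCheckedOut());
    return true;
}
```

      If the lk.unlock() call is moved after session.waitForCheckIn(), the sketch reproduces the deadlock from steps 4 and 5: thread A blocks on replMutex while step down waits for a check-in that can never happen.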

      SERVER-41317 describes a problem that occurred due to the above scenario.

            Assignee:
            suganthi.mani@mongodb.com Suganthi Mani
            Reporter:
            suganthi.mani@mongodb.com Suganthi Mani
            Votes:
            0
            Watchers:
            6