  1. Core Server
  2. SERVER-34726

Deadlock with locally stashed transaction resources during profiling

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • 4.0.0-rc0
    • Affects Version/s: None
    • Component/s: Replication
    • None
    • Fully Compatible
    • ALL
    • Repl 2018-05-21, Repl 2018-06-04
    • 63

      If a dropDatabase is run concurrently with a query operation that's part of a transaction, a deadlock may occur if the query operation attempts to write profiling information.

      Consider the following situation: There are two threads, T1 and T2. T1 needs to have profiling enabled. Here's a timeline of what could happen:

      - T1 starts an operation as part of a transaction and acquires its first lock.
      - T2 performs a dropDatabase and requests an X lock on the database. The request is not granted yet, because T1 holds a conflicting IX lock.
      - T1's command execution finishes and profile() is called.

      In profile() the OperationContext's transaction resources are stored in a temporary local variable, and the OperationContext is given a new RecoveryUnit and Locker. With the new Locker, profile() requests an IX lock on the system.profile collection. The new lock request is enqueued behind the dropDatabase's X lock and is not granted.
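
The queueing behaviour described above can be sketched with a toy lock queue. This is a minimal illustration, not the server's lock manager: the `ToyLock` class, its strict FIFO policy, and the owner names are all assumptions made for the example. It only shows why an IX request from a fresh Locker ends up waiting behind a pending X request even though IX is compatible with the IX lock already granted.

```python
from collections import deque

# Only the two modes from this ticket are modelled: IX is compatible
# with IX, and X conflicts with everything.
COMPATIBLE = {("IX", "IX")}


class ToyLock:
    """Hypothetical single resource with a granted list and a FIFO wait queue."""

    def __init__(self):
        self.granted = []        # list of (owner, mode)
        self.waiting = deque()   # FIFO of (owner, mode)

    def request(self, owner, mode):
        conflicts = any((held, mode) not in COMPATIBLE for _, held in self.granted)
        # Assumed fairness rule: a new request also waits if anything is queued.
        if conflicts or self.waiting:
            self.waiting.append((owner, mode))
            return False
        self.granted.append((owner, mode))
        return True


db = ToyLock()
t1_txn = db.request("T1 transaction Locker", "IX")    # granted
t2_drop = db.request("T2 dropDatabase", "X")          # queued behind T1's IX
t1_profile = db.request("T1 profile() Locker", "IX")  # queued behind T2's X
```

Because profile() uses a new Locker, its request is treated as coming from a different owner than the transaction's original Locker, so it gets no special treatment in the queue.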

      This leaves us with the following lock cycle:

      The lock requested by profile() in T1 cannot be granted until the drop (from T2) has finished.

      The X lock requested by T2 cannot be granted until the locks in the transaction resources stored locally in T1 are released.

      The transaction resources stored in a local variable cannot be moved back to the OperationContext until profile() has completed. Since T1 is deadlocked and its transaction resources are not associated with the session, they cannot be freed (even by the transaction killer or session killer thread).
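
The three dependencies above form a cycle in a wait-for graph. The sketch below encodes them as a plain dictionary and finds the cycle with a depth-first search. The node labels and the `find_cycle` helper are made up for illustration and do not correspond to anything in the server code.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in visiting:
            # Found a back edge: slice out the cycle and close it.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Each key waits for every node in its value list.
waits_for = {
    "profile() IX request (T1)": ["dropDatabase X request (T2)"],
    "dropDatabase X request (T2)": ["locally stashed transaction resources (T1)"],
    "locally stashed transaction resources (T1)": ["profile() IX request (T1)"],
}
cycle = find_cycle(waits_for)
print(" -> ".join(cycle))
```

Every node in the graph appears in the cycle, which is why none of the three parties can make progress on its own.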

      james.wahlin and I believe that one way to fix this would be to stash the TxnResources back on the session instead of in a local variable when shouldDBProfile returns true. This way, the transaction killer or session killer would be able to free the TxnResources.
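
The idea behind the proposed fix can be modelled roughly as follows. This is a toy sketch under stated assumptions: `ToySession`, `kill_session`, and the lock label are hypothetical names, and the real change would live in the server's C++ session and profiling code. The point is only that a killer thread can release resources it can reach through the session, and cannot release resources held in a local variable.

```python
class ToySession:
    """Hypothetical session that may own stashed transaction resources."""

    def __init__(self):
        self.stashed = None


def kill_session(session, held_locks):
    """Model of a killer thread: it can only free what the session owns."""
    if session.stashed is None:
        return False
    for lock in session.stashed:
        held_locks.discard(lock)
    session.stashed = None
    return True


held = {"db IX (T1 transaction)"}
session = ToySession()

# Current behaviour: the resources sit in a local variable inside profile(),
# so the killer finds nothing on the session and the lock stays held.
local_resources = ["db IX (T1 transaction)"]
freed_before = kill_session(session, held)
held_before = set(held)

# Proposed behaviour: stash the resources on the session while profiling,
# so the killer can release them and break the cycle.
session.stashed = local_resources
freed_after = kill_session(session, held)
```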

            Assignee:
            tess.avitabile@mongodb.com Tess Avitabile (Inactive)
            Reporter:
            ian.boros@mongodb.com Ian Boros