Core Server / SERVER-41980

Non-transactional commands can deadlock with prepared transactions when the tickets are exhausted by the non-transactional write commands.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2.0-rc5, 4.3.1
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.2
    • Sprint:
      Repl 2019-07-15, Repl 2019-07-29

      Description

      Assume that only one write ticket is available, and consider the following sequence:

      1) A transaction gets prepared and waits to commit. Once the prepare succeeds on the primary, as part of stashing the lock resources, we release the ticket but keep holding the global lock in IX mode.
      2) Next, commands not running in a transaction (such as create, find, insert) come in and acquire the ticket and the global lock, but get blocked behind the prepared txn on a prepare conflict or a DB/collection-level lock conflict.
      3) Next, the commitTransaction cmd comes in and, as part of unstashing the lock resources, tries to reacquire the ticket. It cannot, because the ticket is held by the non-transactional ops from step 2, so it blocks behind them.
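
The sequence above can be modeled with a small, self-contained Python sketch. This is not server code: the semaphore stands in for the write ticket pool, the event stands in for the prepare conflict being resolved, and the names `simulate`, `non_txn_op` and `commit_transaction` are hypothetical. The global IX lock is not modeled, since IX holders do not conflict with each other. Timeouts are used only so that the demonstration terminates instead of hanging.

```python
import threading


def simulate(write_tickets: int = 1, timeout: float = 0.1) -> dict:
    """Toy model of steps 1-3; returns which operations managed to finish."""
    if write_tickets < 1:
        raise ValueError("write_tickets must be at least 1")

    tickets = threading.Semaphore(write_tickets)  # stand-in for the write ticket pool
    txn_resolved = threading.Event()              # set once the prepared txn commits
    op_holds_ticket = threading.Event()           # used only to order step 2 before step 3
    result = {"op_done": False, "commit_done": False}

    # Step 1 is implicit: the txn is prepared and its ticket was released on stash,
    # so every ticket in the pool is available at this point.

    def non_txn_op() -> None:
        # Step 2: take a ticket, then block on the prepare conflict.
        tickets.acquire()
        op_holds_ticket.set()
        try:
            # Waits longer than the commit does, so the ticket stays held
            # for the whole time the commit is trying to get one.
            result["op_done"] = txn_resolved.wait(5 * timeout)
        finally:
            tickets.release()

    def commit_transaction() -> None:
        # Step 3: unstash needs a ticket before the commit can run.
        op_holds_ticket.wait()
        if tickets.acquire(timeout=timeout):
            try:
                txn_resolved.set()  # commit resolves the prepare conflict
                result["commit_done"] = True
            finally:
                tickets.release()

    threads = [
        threading.Thread(target=non_txn_op),
        threading.Thread(target=commit_transaction),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print("1 ticket :", simulate(1))
    print("2 tickets:", simulate(2))
```

With a single ticket, neither operation finishes before its timeout, which corresponds to the deadlock described here. With a second ticket the commit can proceed and the blocked operation is released, which illustrates that the problem only appears once the non-transactional ops have exhausted the pool.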

      For cross-shard transactions, the transaction coordinator keeps retrying the commitTransaction cmd until it succeeds. But because of the above deadlock, the primary never makes progress. The deadlock happens on the primary because the transaction violates the acquisition ordering while unstashing the lock resources: the ticket is acquired with the global lock already held.

      Note: This is a problem only for prepared txns (the commitTransaction cmd + cross-shard transaction combination) and not for unprepared txns, because an unprepared transaction gets aborted either by the transaction reaper or by a higher transaction number (see SERVER-41976), which allows the ops in step 2 to proceed.
