Core Server / SERVER-59108

Resolve race with transaction operation not killed after step down

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4.11, 4.2.18, 5.0.4, 5.1.0-rc0
    • Component/s: None
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v5.0, v4.4, v4.2
    • Sprint:
      Repl 2021-09-06, Repl 2021-09-20, Repl 2021-10-04, Repl 2021-10-18
    • Linked BF Score:
      70

      Description

      In SERVER-50486, we added a flag on the opCtx of transaction operations to ensure that these operations are interrupted on step down. After setting the flag, we check that we are still the primary: the commandCanRunHere function returns true if we can accept non-local writes.

      In the stepDown code path, we first acquire the RSTL, which is where we run the killOps thread to kill the opCtx of any command that has the flag set. Only then do we update whether we can accept non-local writes. As a result, the following interleaving seems possible:

      1. In the user thread t1, we add a user command to the _clients vector in ServiceContext. However, we have not yet reached ExecCommandDatabase::_initiateCommand() and set the flag.
      2. In the stepDown thread t2, we attempt to acquire the RSTL and loop through all commands. Since the flag is not yet set for the command in t1, it is not killed.
      3. In t1, we now set the flag and check whether we can still service non-local writes. Since we still can, the command proceeds.
      4. In t2, we acquire the RSTL and record that we can no longer service non-local writes. The command from t1 is therefore still running after step down and was never killed.
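
The interleaving above can be replayed deterministically in a small model. This is only a sketch for illustration, not the server code: `OpCtx`, `Node`, `kill_flagged_ops`, `command_can_run_here`, and `run_interleaving` are hypothetical names, and the real logic lives in C++ in the stepDown path and in ExecCommandDatabase::_initiateCommand(). The model assumes the kill pass only touches operations whose flag is already set, as the description states.

```python
from dataclasses import dataclass, field


@dataclass
class OpCtx:
    """Hypothetical stand-in for an operation context."""
    interrupt_on_step_down: bool = False  # the flag added in SERVER-50486
    killed: bool = False


@dataclass
class Node:
    """Hypothetical stand-in for the server state involved in the race."""
    accepts_non_local_writes: bool = True
    clients: list = field(default_factory=list)

    def kill_flagged_ops(self):
        # Step-down kill pass: only ops whose flag is already set are killed.
        for op in self.clients:
            if op.interrupt_on_step_down:
                op.killed = True

    def command_can_run_here(self):
        # Mirrors the description: true while non-local writes are accepted.
        return self.accepts_non_local_writes


def run_interleaving():
    node = Node()
    op = OpCtx()

    # 1. t1 registers the command; the flag is not set yet.
    node.clients.append(op)

    # 2. t2 runs the kill pass while acquiring the RSTL; the op is skipped.
    node.kill_flagged_ops()

    # 3. t1 sets the flag and checks whether the command can run here.
    op.interrupt_on_step_down = True
    proceeds = node.command_can_run_here() and not op.killed

    # 4. t2 holds the RSTL and stops accepting non-local writes.
    node.accepts_non_local_writes = False

    return proceeds, op.killed, node.accepts_non_local_writes
```

In this model the command proceeds and is never killed even though the node has stopped accepting non-local writes, which is the outcome the ticket describes. Swapping steps 2 and 3, or performing step 4 before step 3, makes the model either kill the operation or reject it.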

              People

              Assignee:
              vesselina.ratcheva Vesselina Ratcheva
              Reporter:
              xuerui.fa Xuerui Fa