Core Server / SERVER-27534

All writing operations must fail if the term changes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.5, 3.7.6
    • Component/s: Replication, Write Ops
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v3.6
    • Sprint:
      Repl 2017-10-02, Repl 2017-10-23, Repl 2017-11-13, Repl 2017-12-04, Repl 2017-12-18, Query 2018-03-12, Query 2018-03-26, Query 2018-04-09, Query 2018-04-23
    • Linked BF Score:
      68

      Description

      Consider the following sequence of events during a batch insert of 1000 documents with ordered:true and a w:majority writeConcern.

      1. Insert 500 documents and unlock
      2. Pause the inserting thread
      3. Another node steps up and the original primary rolls back the 500 writes already done
      4. The original primary steps back up
      5. The inserting thread then does the remaining writes which get new optimes
      6. That thread then waits for majority confirmation of the last writes, and successfully returns to the user

      In this case we've lost 500 writes that are w:majority confirmed, and we've written later ops without the earlier ops even with ordered:true. This is caused by a combination of not killing all ops (at least all writing ops) on all replSet stepdown paths, not closing all connections, and always asking "can I currently write to this namespace" rather than "have I always been able to write to this namespace since starting this op".
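      The sequence above can be reproduced in a toy model. This is only a sketch: ToyNode, insert_batch, and failover are hypothetical names invented for illustration and do not correspond to server code. It assumes a single lock release after the first 500 documents, and it models the w:majority acknowledgement as a plain return value.

```python
class ToyNode:
    """Hypothetical stand-in for a replica set member; not MongoDB code."""

    def __init__(self):
        self.term = 1
        self.is_primary = True
        self.oplog = []  # list of (term, doc) entries

    def can_accept_writes(self):
        # The "can I currently write?" question: looks only at the present state.
        return self.is_primary

    def step_down_and_roll_back(self, keep):
        # Another node wins an election (new term); entries past `keep` are rolled back.
        self.is_primary = False
        self.term += 1
        del self.oplog[keep:]

    def step_up(self):
        # The original primary wins a later election, which starts another term.
        self.term += 1
        self.is_primary = True


def insert_batch(node, docs, yield_after, on_yield):
    """Ordered batch insert that releases the lock once, after `yield_after` docs."""
    for i, doc in enumerate(docs):
        if i == yield_after:
            on_yield()  # lock released; the inserting thread is paused here
        if not node.can_accept_writes():
            raise RuntimeError("not primary")
        node.oplog.append((node.term, doc))
    return "ok"  # stands in for a successful w:majority acknowledgement


node = ToyNode()


def failover():
    node.step_down_and_roll_back(keep=0)  # step 3: the first 500 writes are rolled back
    node.step_up()                        # step 4: the original primary steps back up


result = insert_batch(node, list(range(1000)), yield_after=500, on_yield=failover)
surviving = [doc for _, doc in node.oplog]
print(result, len(surviving), surviving[0])  # prints "ok 500 500"
```

      The batch reports success even though documents 0 through 499 are gone, because the only check made is against the node's state at the moment of each write.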

      This issue also affects any operations that write multiple oplog entries with a release of the global lock in between, and "no-op" ops that get the last optime after releasing the global lock. A non-exhaustive list:

      • All batch write operations (insert, update, delete)
      • Multi-update and Multi-delete
      • Agg with $out
      • MapReduce

      Potential solutions:

      1. Fail all write ops and waitForWriteConcern if the electionId (or rbid) changed since the op began
      2. Interrupt all write ops (or all ops) on all stepdown paths. Also need to either:
        a) Ensure all write ops check for interrupt every time they acquire the global lock, after acquiring it (currently they check first)
        b) Make all lock acquisitions checkForInterrupt (this is planned already to support interruptible locking)
      3. Record the term at the beginning of every operation. In the logOp (and awaitReplication) code, check that the term of the write matches what was recorded, and abort the write if not.
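      As a rough illustration of option 3, the sketch below records the term when an operation starts and rechecks it on every write and before waiting for replication. All names here (ToyNode, Operation, TermChanged, fail_over_and_back) are hypothetical; log_op and await_replication merely borrow their names from the logOp and awaitReplication code paths mentioned above and are not the server's implementation.

```python
class TermChanged(Exception):
    """Raised when the node's term differs from the one the operation started in."""


class ToyNode:
    """Hypothetical stand-in for a replica set member; not MongoDB code."""

    def __init__(self):
        self.term = 1
        self.is_primary = True
        self.oplog = []  # list of (term, doc) entries

    def fail_over_and_back(self):
        # Another node is elected (term + 1), earlier writes are rolled back,
        # then this node is re-elected (term + 1 again).
        del self.oplog[:]
        self.term += 2


class Operation:
    def __init__(self, node):
        self.node = node
        self.start_term = node.term  # recorded once, when the operation begins

    def _check_term(self):
        # "Have I always been able to write since starting this op?"
        if not self.node.is_primary or self.node.term != self.start_term:
            raise TermChanged(
                f"started in term {self.start_term}, now in term {self.node.term}"
            )

    def log_op(self, doc):
        self._check_term()
        self.node.oplog.append((self.node.term, doc))

    def await_replication(self):
        # Refuse to report success for an operation that spans a term change.
        self._check_term()
        return "ok"


node = ToyNode()
op = Operation(node)
for doc in range(500):
    op.log_op(doc)

node.fail_over_and_back()  # happens while the op has released the lock

try:
    op.log_op(500)
    outcome = "written"
except TermChanged as exc:
    outcome = f"aborted: {exc}"

print(outcome)          # prints "aborted: started in term 1, now in term 3"
print(len(node.oplog))  # prints "0"
```

      Comparing against the recorded term catches the case where the node is primary again by the time the operation resumes, which a check of the current primary state alone would miss.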

              People

              Assignee:
              justin.seyster Justin Seyster
              Reporter:
              redbeard0531 Mathias Stearn