  1. Core Server
  2. SERVER-12516

Multi-updates may fail to detect replica set primary step-down, leading to inconsistency.

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.6.0-rc0
    • Affects Version/s: 2.4.9, 2.5.5
    • Component/s: Replication, Write Ops
    • Labels: None
    • Operating System: ALL
    • Steps To Reproduce:

      Start a 2-node replica set. Connect a shell to the primary, using either legacy write operations or write commands, and run:

      for (i = 0; i < 100 * 1000; ++i) { db.foo.insert({_id: i, a: 1}) }
      db.getLastError();
      db.foo.update({}, {$inc: { a: 5 }}, false, true);  // multi-update

      From another shell, immediately run

      db.adminCommand({replSetStepDown: 30, force: true})

      The log on the primary then shows a stack trace and the following message:

      Assertion: 13312:replSet error : logOp() but not primary?


    Description

      If the primary steps down in the middle of a multi-update, the operation may keep updating documents until its first attempt, after the step-down, to log an op to the oplog. At that point logOp() fails, but the database is already inconsistent: it contains the last update, which never appears in the oplog and so never replicates. Nor is that update rolled back when the new primary takes writes, because the oplog holds no trace of it.
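The failure mode described above can be illustrated with a small toy model in plain JavaScript. This is only a sketch under stated assumptions, not MongoDB code: `makeNode`, `multiUpdateInc`, `runScenario`, and this `logOp` are hypothetical names, the "oplog" is an array, and the step-down is simulated by flipping a flag during a pretend yield.

```javascript
// Toy in-memory model; not MongoDB internals. All names are hypothetical.
function makeNode(docCount) {
  const docs = [];
  for (let i = 0; i < docCount; ++i) docs.push({ _id: i, a: 1 });
  return { isPrimary: true, docs, oplog: [] };
}

// Stand-in for the massert on this condition: throws once the node
// is no longer primary, otherwise appends the entry to the oplog.
function logOp(node, entry) {
  if (!node.isPrimary) {
    throw new Error("13312: replSet error : logOp() but not primary?");
  }
  node.oplog.push(entry);
}

// Multi-update that modifies the document first and logs second.
// onYield is called between documents to stand in for a lock yield.
function multiUpdateInc(node, amount, onYield) {
  for (const doc of node.docs) {
    doc.a += amount;                             // the write lands in the data
    logOp(node, { _id: doc._id, inc: amount });  // may throw after the write
    onYield(doc._id);
  }
}

function runScenario() {
  const node = makeNode(5);
  let error = null;
  try {
    multiUpdateInc(node, 5, (id) => {
      if (id === 1) node.isPrimary = false;      // step-down during a yield
    });
  } catch (e) {
    error = e.message;
  }
  const updated = node.docs.filter((d) => d.a === 6).map((d) => d._id);
  const logged = node.oplog.map((e) => e._id);
  return { error, updated, logged };
}
```

In this model, `runScenario()` leaves documents 0, 1, and 2 modified while only documents 0 and 1 have oplog entries: the write to document 2 happened after the simulated step-down and is invisible to replication.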

      A minimal fix would be to turn the current massert() on this condition into an fassert(), so that the server process terminates instead of continuing in an inconsistent state. This eliminates the corruption.

      Later, it will be necessary to audit all insert, update and remove paths (legacy and write command) to ensure that they validate primary-ness after recovering from yields.
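A rough sketch of what "validate primary-ness after recovering from yields" could look like, again as a toy model in plain JavaScript rather than actual server code. `multiUpdateChecked` and `runCheckedScenario` are hypothetical names, and the error message is illustrative only.

```javascript
// Toy in-memory model; not MongoDB internals. All names are hypothetical.
function multiUpdateChecked(node, amount, yieldFn) {
  let nUpdated = 0;
  for (const doc of node.docs) {
    yieldFn(doc._id);                  // stand-in for a lock yield
    if (!node.isPrimary) {             // re-validate after every yield
      return { ok: false, nUpdated, errmsg: "not primary" };
    }
    doc.a += amount;                   // safe: still primary at this point
    node.oplog.push({ _id: doc._id, inc: amount });
    ++nUpdated;
  }
  return { ok: true, nUpdated };
}

function runCheckedScenario() {
  const node = { isPrimary: true, oplog: [], docs: [] };
  for (let i = 0; i < 5; ++i) node.docs.push({ _id: i, a: 1 });
  const result = multiUpdateChecked(node, 5, (id) => {
    if (id === 2) node.isPrimary = false;  // step-down during the yield before doc 2
  });
  const changed = node.docs.filter((d) => d.a !== 1).length;
  return { result, changed, logged: node.oplog.length };
}
```

Here the operation stops before touching document 2, so the number of modified documents and the number of oplog entries stay equal (2 each in this run). The design choice is to check before each write rather than rely on the logging call to detect the step-down after the fact.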

          People

            Assignee: Andy Schwerin (schwerin@mongodb.com)
            Reporter: Andy Schwerin (schwerin@mongodb.com)
            Votes: 0
            Watchers: 6