SERVER-30049 (Core Server)

applyOperation_inlock() allows exceptions from Collection::insertDocument() to percolate to caller

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.2.17, 3.4.7, 3.5.11
    • Affects Version/s: 3.4.4
    • Component/s: Replication
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Steps To Reproduce:
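      // Build ten non-upserting inserts into test.test, plus one nested applyOps command.
      var ops = [];
      for (var i = 0; i < 10; ++i) {
          ops.push({op: "i", o: {_id: i}, ts: Timestamp(), v: 2, ns: "test.test"});
      }
      ops.push({ns: "test.$cmd", op: "c", o: {applyOps: []}});
      // Start from a fresh collection that already exists.
      db.test.drop();
      db.test.insert({});
      // Trace write conflicts and make the storage engine throw them about 5% of the time.
      db.adminCommand({setParameter: 1, traceWriteConflictExceptions: true});
      db.adminCommand({configureFailPoint: 'WTWriteConflictException', mode: {activationProbability: 0.05}});
      // Because the fail point is probabilistic, this may need several runs before it fails.
      db.adminCommand({applyOps: ops});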
      
    • Sprint: Repl 2017-07-31

      This bug affects 3.4. Although some of the code has changed in master, I believe the bug still exists there in the same form. This ticket references the 3.4.4 code.

      The bug is in the contract between _applyOps and applyOperation_inlock when _applyOps calls it for a non-upserting insert. If applyOperation_inlock is the side that is wrong, its return values for other operation types may also need to be audited.

      Consider a WriteConflictException (WCE) thrown in the collection->insert call here: https://github.com/mongodb/mongo/blob/r3.4.4/src/mongo/db/repl/oplog.cpp#L846-L861

      In this case applyOperation_inlock converts the WCE (which extends DBException) into a Status carrying the corresponding error code, 112 (WriteConflict).

      However, the caller in _applyOps expects WCEs to be thrown, because the call is wrapped in a WCE retry loop:
      https://github.com/mongodb/mongo/blob/r3.4.4/src/mongo/db/catalog/apply_ops.cpp#L167-L177

      The status with error code 112 increments the errors counter:
      https://github.com/mongodb/mongo/blob/r3.4.4/src/mongo/db/catalog/apply_ops.cpp#L194-L197

      This results in the applyOps command returning error code 8 (UnknownError):
      https://github.com/mongodb/mongo/blob/r3.4.4/src/mongo/db/catalog/apply_ops.cpp#L252-L254
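The control flow described above can be modeled in a few lines of plain JavaScript. This is only a simplified sketch for illustration: every function and class name below is a hypothetical stand-in for the C++ code linked above, not the server's actual API, and the only values taken from the ticket are the error codes 112 and 8.

```javascript
// Simplified model of the 3.4.4 flow. All names are illustrative stand-ins.
const WRITE_CONFLICT = 112; // code carried by the Status in the ticket
const UNKNOWN_ERROR = 8;    // code returned by the applyOps command

class WriteConflictException extends Error {
    constructor() {
        super("WriteConflict");
        this.code = WRITE_CONFLICT;
    }
}

// Stand-in for the collection insert: throws on the first `failures` calls.
function makeInsert(failures) {
    let calls = 0;
    return function insertDocument(doc) {
        calls += 1;
        if (calls <= failures) {
            throw new WriteConflictException();
        }
        return doc;
    };
}

// Stand-in for applyOperation_inlock: catches the exception and
// reports it as a status object instead of letting it propagate.
function applyOperationReturningStatus(insertDocument, op) {
    try {
        insertDocument(op.o);
        return {ok: true, code: 0};
    } catch (e) {
        return {ok: false, code: e.code};
    }
}

// Stand-in for the retry loop in _applyOps: only a *thrown*
// write conflict causes another attempt.
function writeConflictRetry(fn) {
    let attempts = 0;
    for (;;) {
        attempts += 1;
        try {
            return {status: fn(), attempts: attempts};
        } catch (e) {
            if (!(e instanceof WriteConflictException)) {
                throw e;
            }
        }
    }
}

// Stand-in for the applyOps command: any failed status bumps the error
// counter, and a non-zero counter becomes UnknownError.
function applyOps(ops, insertDocument) {
    let errors = 0;
    let attempts = 0;
    for (const op of ops) {
        const r = writeConflictRetry(
            () => applyOperationReturningStatus(insertDocument, op));
        attempts += r.attempts;
        if (!r.status.ok) {
            errors += 1;
        }
    }
    return errors === 0
        ? {ok: 1, attempts: attempts}
        : {ok: 0, code: UNKNOWN_ERROR, attempts: attempts};
}

// One insert that hits a single transient write conflict.
const result = applyOps([{op: "i", o: {_id: 0}}], makeInsert(1));
console.log(JSON.stringify(result));
```

In this model the retry loop runs exactly once, because the conflict reaches it as a returned status rather than a thrown exception, so a transient conflict surfaces to the client as UnknownError.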

      As far as I know, engaging the retry loop in _applyOps would be the correct behavior.
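A rough sketch of that suggested behavior, again in plain JavaScript with hypothetical names: if the apply step lets the write conflict propagate, the surrounding loop retries and the insert eventually succeeds. This illustrates the idea only and is not the change that was actually committed for this ticket.

```javascript
// Hypothetical stand-ins; none of these names exist in the server code.
class WriteConflictError extends Error {}

// Insert stand-in that throws a write conflict on the first `failures` calls.
function makeFlakyInsert(failures) {
    let calls = 0;
    return function insertDocument(doc) {
        calls += 1;
        if (calls <= failures) {
            throw new WriteConflictError("WriteConflict");
        }
        return doc;
    };
}

// Apply step that does NOT catch the write conflict, so the caller sees it.
function applyOperationThrowing(insertDocument, op) {
    insertDocument(op.o);
    return {ok: true};
}

// Retry loop: repeats the operation for as long as it throws a write conflict.
// It is unbounded here for brevity; any other exception is rethrown.
function retryOnWriteConflict(fn) {
    let attempts = 0;
    for (;;) {
        attempts += 1;
        try {
            return {status: fn(), attempts: attempts};
        } catch (e) {
            if (!(e instanceof WriteConflictError)) {
                throw e;
            }
        }
    }
}

// Two transient conflicts, then success on the third attempt.
const flakyInsert = makeFlakyInsert(2);
const outcome = retryOnWriteConflict(
    () => applyOperationThrowing(flakyInsert, {op: "i", o: {_id: 0}}));
console.log(JSON.stringify(outcome));
```

Here the two transient conflicts are absorbed by the loop and the caller only sees the final successful status.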

      For interested parties: WCEs can (infrequently) be thrown on reads and writes for reasons other than concurrent access to the same document. Those other causes are usually related to memory pressure.

            Assignee: Benety Goh (benety.goh@mongodb.com)
            Reporter: Daniel Gottlieb (daniel.gottlieb@mongodb.com, inactive)
            Votes: 1
            Watchers: 17

              Created:
              Updated:
              Resolved: