Core Server / SERVER-21956

applyOps does not correctly propagate operation cancellation exceptions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.3, 3.3.0
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility:
      Minor Change
    • Operating System:
      ALL
    • Backport Completed:

      Description

      This was discovered while running the sharding suite with the continuous primary stepdown thread enabled. The applyOps command uses DBDirectClient, so if a stepdown happens just as the operation is about to start and the thread is interrupted, DBDirectClient ends up returning error 13106 instead of the interruption error (11601).

      Here are some excerpts from the verbose logs:

      [js_test:balance_repl] 2015-12-18T18:02:13.249+0000 c20514| 2015-12-18T18:02:12.936+0000 D -        [conn32] User Assertion: 11601:operation was interrupted
      [js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY    [conn32] assertion 11601 operation was interrupted ns:config.chunks query:{ query: { ns: "test.foo" }, orderby: { lastmod: -1 } }
      [js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY    [conn32]  ntoskip:0 ntoreturn:1
      [js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY    [conn32] query config.chunks query: { query: { ns: "test.foo" }, orderby: { lastmod: -1 } } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 exception: operation was interrupted code:11601 numYields:0 reslen:71 locks:{ Global: { acquireCount: { r: 3, W: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms
      [js_test:balance_repl] 2015-12-18T18:02:13.251+0000 c20514| 2015-12-18T18:02:12.936+0000 D -        [conn32] User Assertion: 13106:nextSafe(): { $err: "operation was interrupted", code: 11601 }
      [js_test:balance_repl] 2015-12-18T18:02:13.252+0000 c20514| 2015-12-18T18:02:12.936+0000 D COMMAND  [conn32] assertion while executing command 'applyOps' on database 'config' with arguments '{ applyOps: [ { op: "u", b: true, ns: "config.chunks", o: { _id: "test.foo-_id_600.0", lastmod: Timestamp 1000|15, lastmodEpoch: ObjectId('56744a23fc2e02a76c6d8248'), ns: "test.foo", min: { _id: 600.0 }, max: { _id: 700.0 }, shard: "test-rs0" }, o2: { _id: "test.foo-_id_600.0" } }, { op: "u", b: true, ns: "config.chunks", o: { _id: "test.foo-_id_700.0", lastmod: Timestamp 1000|16, lastmodEpoch: ObjectId('56744a23fc2e02a76c6d8248'), ns: "test.foo", min: { _id: 700.0 }, max: { _id: MaxKey }, shard: "test-rs0" }, o2: { _id: "test.foo-_id_700.0" } } ], preCondition: [ { ns: "config.chunks", q: { query: { ns: "test.foo" }, orderby: { lastmod: -1 } }, res: { lastmod: Timestamp 1000|14 } } ], maxTimeMS: 30000 }' and metadata '{ $replData: 1 }': 13106 nextSafe(): { $err: "operation was interrupted", code: 11601 }
      

      Putting this ticket in the sharding bucket, because sharding is the main consumer of applyOps.
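
The wrapping visible in the log excerpts above (an interruption with code 11601 resurfacing as 13106 from nextSafe()) can be modelled with a small sketch. Everything here is hypothetical and for illustration only: the class and function names are not MongoDB server APIs, and the sketch makes no claim about how the actual fix was implemented. Only the two numeric codes and the shape of the `$err` document are taken from the logs.

```python
# Hypothetical model of the error wrapping described in this ticket.
# None of these names are MongoDB server APIs; only the numeric codes
# (11601, 13106) and the $err document shape come from the log excerpts.

INTERRUPTED = 11601        # "operation was interrupted"
NEXT_SAFE_FAILED = 13106   # code attached to the nextSafe() assertion in the logs


class UserAssertion(Exception):
    """Stand-in for a server-side assertion carrying a numeric code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def next_safe_wrapping(doc):
    """Mimics the reported behaviour: any $err document is raised as 13106."""
    if "$err" in doc:
        raise UserAssertion(NEXT_SAFE_FAILED, f"nextSafe(): {doc}")
    return doc


def next_safe_propagating(doc):
    """One possible alternative: re-raise with the embedded code when present."""
    if "$err" in doc:
        raise UserAssertion(doc.get("code", NEXT_SAFE_FAILED), doc["$err"])
    return doc


def error_code(fn, doc):
    """Returns the code raised by fn(doc), or None if the call succeeds."""
    try:
        fn(doc)
    except UserAssertion as exc:
        return exc.code
    return None


interrupted_reply = {"$err": "operation was interrupted", "code": INTERRUPTED}
print(error_code(next_safe_wrapping, interrupted_reply))     # 13106
print(error_code(next_safe_propagating, interrupted_reply))  # 11601
```

With the wrapping variant, a caller that checks for the interruption code never sees it, which matches what the applyOps caller observed in the logs. The propagating variant keeps the original code visible to the caller.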
