  Core Server / SERVER-21956

applyOps does not correctly propagate operation cancellation exceptions

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.2.3, 3.3.0
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility: Minor Change
    • Operating System: ALL

      This was discovered while running the sharding suite with the continuous primary stepdown thread enabled. The applyOps command uses DBDirectClient, so if a stepdown happens just as the operation is about to start and the thread is interrupted, DBDirectClient returns error 13106 (nextSafe()) instead of the interruption error 11601.

      Here are some excerpts from the verbose logs:

      [js_test:balance_repl] 2015-12-18T18:02:13.249+0000 c20514| 2015-12-18T18:02:12.936+0000 D -        [conn32] User Assertion: 11601:operation was interrupted
      [js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY    [conn32] assertion 11601 operation was interrupted ns:config.chunks query:{ query: { ns: "test.foo" }, orderby: { lastmod: -1 } }
      [js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY    [conn32]  ntoskip:0 ntoreturn:1
      [js_test:balance_repl] 2015-12-18T18:02:13.250+0000 c20514| 2015-12-18T18:02:12.936+0000 I QUERY    [conn32] query config.chunks query: { query: { ns: "test.foo" }, orderby: { lastmod: -1 } } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 exception: operation was interrupted code:11601 numYields:0 reslen:71 locks:{ Global: { acquireCount: { r: 3, W: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms
      [js_test:balance_repl] 2015-12-18T18:02:13.251+0000 c20514| 2015-12-18T18:02:12.936+0000 D -        [conn32] User Assertion: 13106:nextSafe(): { $err: "operation was interrupted", code: 11601 }
      [js_test:balance_repl] 2015-12-18T18:02:13.252+0000 c20514| 2015-12-18T18:02:12.936+0000 D COMMAND  [conn32] assertion while executing command 'applyOps' on database 'config' with arguments '{ applyOps: [ { op: "u", b: true, ns: "config.chunks", o: { _id: "test.foo-_id_600.0", lastmod: Timestamp 1000|15, lastmodEpoch: ObjectId('56744a23fc2e02a76c6d8248'), ns: "test.foo", min: { _id: 600.0 }, max: { _id: 700.0 }, shard: "test-rs0" }, o2: { _id: "test.foo-_id_600.0" } }, { op: "u", b: true, ns: "config.chunks", o: { _id: "test.foo-_id_700.0", lastmod: Timestamp 1000|16, lastmodEpoch: ObjectId('56744a23fc2e02a76c6d8248'), ns: "test.foo", min: { _id: 700.0 }, max: { _id: MaxKey }, shard: "test-rs0" }, o2: { _id: "test.foo-_id_700.0" } } ], preCondition: [ { ns: "config.chunks", q: { query: { ns: "test.foo" }, orderby: { lastmod: -1 } }, res: { lastmod: Timestamp 1000|14 } } ], maxTimeMS: 30000 }' and metadata '{ $replData: 1 }': 13106 nextSafe(): { $err: "operation was interrupted", code: 11601 }
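
      The excerpt above shows the interruption (code 11601) being re-raised under the generic code 13106, so the caller of applyOps can no longer tell that the operation was merely interrupted. The following is a minimal, self-contained C++ sketch of that masking pattern, not the server's actual code. CodedError, Reply, runInterruptedQuery, nextSafeMasking, nextSafePropagating and observedCode are hypothetical names introduced for illustration. Only the two error codes and their messages are taken from the logs. The propagating variant is one possible remedy and is not necessarily the change that resolved this ticket.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a server exception that carries a numeric code.
struct CodedError : std::runtime_error {
    int code;
    CodedError(int c, const std::string& msg) : std::runtime_error(msg), code(c) {}
};

constexpr int kInterrupted = 11601;     // "operation was interrupted" (from the logs)
constexpr int kNextSafeFailed = 13106;  // "nextSafe(): ..." (from the logs)

// Hypothetical reply: an error reply carries the $err text and its code.
struct Reply {
    bool isError;
    int code;
    std::string errmsg;
};

// Simulates the inner query being interrupted by a stepdown.
Reply runInterruptedQuery() {
    return Reply{true, kInterrupted, "operation was interrupted"};
}

// Masking behaviour: every error reply is rethrown under the generic code
// 13106, and the original code survives only inside the message text.
void nextSafeMasking(const Reply& r) {
    if (r.isError) {
        throw CodedError(kNextSafeFailed,
                         "nextSafe(): { $err: \"" + r.errmsg +
                             "\", code: " + std::to_string(r.code) + " }");
    }
}

// Assumed remedy: rethrow using the code embedded in the reply, so that an
// interruption still looks like an interruption to the caller.
void nextSafePropagating(const Reply& r) {
    if (r.isError) {
        throw CodedError(r.code, r.errmsg);
    }
}

// Returns the error code a caller would observe, or 0 if nothing was thrown.
template <typename F>
int observedCode(F nextSafe) {
    try {
        nextSafe(runInterruptedQuery());
    } catch (const CodedError& e) {
        return e.code;
    }
    return 0;
}
```

      In this sketch, observedCode(nextSafeMasking) yields 13106, matching the final assertion in the log, while observedCode(nextSafePropagating) yields 11601, which a caller could recognise as an interruption.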
      

      Putting this ticket in the sharding bucket, because sharding is the main consumer of applyOps.

            Assignee:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Reporter:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Votes:
            0
            Watchers:
            7
