Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-38297

Killing session on a secondary currently applying prepare oplog entry can fassert

    • Fully Compatible
    • ALL
    • Sharding 2018-12-17, Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25

      When secondaries apply the oplog entry that prepares a transaction, they check out the corresponding MongoD session. If the session is killed while the the entry is being applied, the operation context applying may be interrupted, leading to an fassert here in SyncTail.

      Example failure: 

      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.379-0500 D TXN      [repl writer worker 11] New transaction started with txnNumber: 165 on session with lsid 0bcc3f79-2f23-40e5-933f-9832f261d827
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.386-0500 I COMMAND  [thread23] killing op: 2587 as part of killing session: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F REPL     [repl writer worker 11] writer worker caught exception: Interrupted: operation was interrupted on: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F REPL     [rsSync-0] Failed to apply batch of operations. Number of operations in batch: 1. First operation: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }. Last operation: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }. Oplog application failed in writer thread 9: Interrupted: operation was interrupted
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F -        [rsSync-0] Fatal assertion 34437 Interrupted: operation was interrupted at src/mongo/db/repl/sync_tail.cpp 878
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F -        [rsSync-0]
      [ShardedClusterFixture:job0:shard1:secondary1] 
      [ShardedClusterFixture:job0:shard1:secondary1] ***aborting after fassert() failure
      [ShardedClusterFixture:job0:shard1:secondary1]
      

            Assignee:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Reporter:
            jack.mulrow@mongodb.com Jack Mulrow
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: