Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-38297

Killing session on a secondary currently applying prepare oplog entry can fassert

    XMLWordPrintable

Details

    • Fully Compatible
    • ALL
    • Sharding 2018-12-17, Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25

    Description

      When secondaries apply the oplog entry that prepares a transaction, they check out the corresponding MongoD session. If the session is killed while the the entry is being applied, the operation context applying may be interrupted, leading to an fassert here in SyncTail.

      Example failure: 

      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.379-0500 D TXN      [repl writer worker 11] New transaction started with txnNumber: 165 on session with lsid 0bcc3f79-2f23-40e5-933f-9832f261d827
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.386-0500 I COMMAND  [thread23] killing op: 2587 as part of killing session: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F REPL     [repl writer worker 11] writer worker caught exception: Interrupted: operation was interrupted on: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F REPL     [rsSync-0] Failed to apply batch of operations. Number of operations in batch: 1. First operation: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }. Last operation: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }. Oplog application failed in writer thread 9: Interrupted: operation was interrupted
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F -        [rsSync-0] Fatal assertion 34437 Interrupted: operation was interrupted at src/mongo/db/repl/sync_tail.cpp 878
      [ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F -        [rsSync-0]
      [ShardedClusterFixture:job0:shard1:secondary1] 
      [ShardedClusterFixture:job0:shard1:secondary1] ***aborting after fassert() failure
      [ShardedClusterFixture:job0:shard1:secondary1]
      

      Attachments

        Issue Links

          Activity

            People

              kaloian.manassiev@mongodb.com Kaloian Manassiev
              jack.mulrow@mongodb.com Jack Mulrow
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: