[SERVER-38297] Killing session on a secondary currently applying prepare oplog entry can fassert Created: 28/Nov/18  Updated: 29/Oct/23  Resolved: 14/Mar/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.1.9

Type: Bug Priority: Major - P3
Reporter: Jack Mulrow Assignee: Kaloian Manassiev
Resolution: Fixed Votes: 0
Labels: open_todo_in_code, prepare_errors, todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-37348 TransactionReaper and periodic transa... Closed
Related
is related to SERVER-40186 The logic in `auto_retry_transaction.... Closed
is related to SERVER-36485 ‘killSessions’ (for one session) and ... Closed
is related to SERVER-36486 Return "PreparedTransactionInProgress... Closed
is related to SERVER-37348 TransactionReaper and periodic transa... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2018-12-17, Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25
Participants:

 Description   

When secondaries apply the oplog entry that prepares a transaction, they check out the corresponding MongoD session. If the session is killed while the the entry is being applied, the operation context applying may be interrupted, leading to an fassert here in SyncTail.

Example failure: 

[ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.379-0500 D TXN      [repl writer worker 11] New transaction started with txnNumber: 165 on session with lsid 0bcc3f79-2f23-40e5-933f-9832f261d827
[ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.386-0500 I COMMAND  [thread23] killing op: 2587 as part of killing session: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }
[ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F REPL     [repl writer worker 11] writer worker caught exception: Interrupted: operation was interrupted on: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }
[ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F REPL     [rsSync-0] Failed to apply batch of operations. Number of operations in batch: 1. First operation: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }. Last operation: { lsid: { id: UUID("0bcc3f79-2f23-40e5-933f-9832f261d827"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, txnNumber: 165, op: "c", ns: "admin.$cmd", o: { applyOps: [ { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.34: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.34: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 5.0 } }, { op: "u", ns: "test0_fsmdb0.fsmcoll0", ui: UUID("94404109-1c63-4cb8-85b0-bcf3dc1757ef"), o: { $v: 1, $set: { metadata.46: { dbName: "test0_fsmdb0", collName: "fsmcoll0", txnNum: 165.0 }, order.46: { tid: 6.0, iteration: 16.0, numUpdated: 3.0 } } }, o2: { _id: 1.0 } } ], prepare: true }, ts: Timestamp(1543429938, 13), t: 1, h: 5752351775086301644, v: 2, wall: new Date(1543429938357), stmtId: 0, prevOpTime: { ts: Timestamp(0, 0), t: -1 }, prepare: true }. Oplog application failed in writer thread 9: Interrupted: operation was interrupted
[ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F -        [rsSync-0] Fatal assertion 34437 Interrupted: operation was interrupted at src/mongo/db/repl/sync_tail.cpp 878
[ShardedClusterFixture:job0:shard1:secondary1] 2018-11-28T13:32:18.387-0500 F -        [rsSync-0]
[ShardedClusterFixture:job0:shard1:secondary1] 
[ShardedClusterFixture:job0:shard1:secondary1] ***aborting after fassert() failure
[ShardedClusterFixture:job0:shard1:secondary1]



 Comments   
Comment by Kaloian Manassiev [ 03/May/19 ]

Yes, not sure how I managed to tag it as 4.0.7, but it was a mistake. Thanks for pointing it out!

Comment by Maria van Keulen [ 03/May/19 ]

kaloian.manassiev I see the fix version for this ticket is 4.0.7 but the commits were tagged as r4.1.9. Should this fixVersion be changed to 4.1.9?

Comment by Kaloian Manassiev [ 14/Mar/19 ]

The only test, which remains disabled now is `multi_statement_transaction_kill_sessions_atomicity_isolation.js`, which is failing for unrelated reasons and I am going to track it under SERVER-40186.

Comment by Githook User [ 13/Mar/19 ]

Author:

{'name': 'Kaloian Manassiev', 'username': 'kaloianm', 'email': 'kaloian.manassiev@mongodb.com'}

Message: SERVER-38297 Reenable FSM transactions tests with killOp
Branch: master
https://github.com/mongodb/mongo/commit/ac2e9336b62fc726b29aa624b8ac05bc045523e5

Comment by Githook User [ 13/Mar/19 ]

Author:

{'name': 'Kaloian Manassiev', 'username': 'kaloianm', 'email': 'kaloian.manassiev@mongodb.com'}

Message: SERVER-38297 Cleanup SyncTail and OplogApplierImpl
Branch: master
https://github.com/mongodb/mongo/commit/1e619f3eaf5615d582f6fd88c254e9fcb9c73b2d

Comment by Kaloian Manassiev [ 07/Mar/19 ]

Actually, Judah already pointed me to the change done under SERVER-37348, which already did this, but at a higher level, so I think there might actually not be anything to do for this ticket other than re-enable the tests.

To your question about why allowing killSession on secondaries - why would that be disallowed? There could be secondary reads under a session and these need to be killed somehow. In terms of who should own a session, the answer is that there should always be a single owner of a session at a time on a replica set. The model where secondary application checks-out the session doesn't exactly jive with that model, but since we don't allow transactions on secondaries, this probably doesn't matter right now.

Comment by Siyuan Zhou [ 07/Mar/19 ]

The plan makes sense to me. Shutdown came to my mind initially, but I don't think we use InterruptedDueToStepDown to shutdown secondary application. We share the opCtx for all operations on the same writer in a batch. Currently, any command that checks out a session runs in its own batch, so this isn't a problem. But that makes feel we need to disable interruptions for all secondary application.

A behavioral question: why do we allow killSession on secondaries? I remember mongos will send killSession to all nodes in a replset, but it's always unclear to me who should own a session. Session is used by transaction and a transaction's life cycle is owned by primary, so it sounds like session should be owned by primary too. Session is also used by secondary read for some reason I don't know so it sounds a local state of a secondary on the other hand.

Comment by Kaloian Manassiev [ 07/Mar/19 ]

From looking at the chain of calls that leads to _applyPrepareTransaction, I see that as implemented currently, the applier (SyncTail) never expects to be interrupted (or for any exception to escape while applying oplog).

Since making it exception-safe is going to be a lot of work and given the fact that it can't be interrupted anyways, I propose that we fix this by making any sessions checked-out through MongoDOperationContextSessionWithoutRefresh uninterruptible by making the entire applyCommand_inlock call uninterruptible, if called with OplogApplication::Mode != kApplyOpsCmd. Alternatively I could make only this invocation uninterruptible.

This will take care of all code paths where MongoDOperationContextSessionWithoutRefresh is used without impacting interruptability of the applyOps command itself.

siyuan.zhou, tess.avitabile, what do you think?

Comment by Judah Schvimer [ 06/Mar/19 ]

As part of this ticket, please unblacklist tests marked with a TODO against this ticket.

Comment by Judah Schvimer [ 03/Dec/18 ]

Open question: what happens currently if you call "killOp" on a secondary oplog applier thread?

Comment by Kaloian Manassiev [ 30/Nov/18 ]

Is the problem here that killSession can come concurrently with a session which is currently being operated on by multiApply?

The sessions refactor work will not help with this. If the applier code cannot handle interruptions it should mark the the operation context uninterruptible.

It is a separate problem if we want to avoid interrupting sessions with prepared transactions on them as described in SERVER-36485 - in this case we will have to implement the markUnkillable concept, which would make the session kill attempts throw a "PreparedTransaction" exception. Up till now I didn't think this was necessary, because the code which uses it could just make itself uninterruptible, but I agree that it is a better user experience if we return the proper error?

Let's talk about this on Monday when I will be in the office.

Comment by Gregory McKeon (Inactive) [ 29/Nov/18 ]

judah.schvimer, I pinged kaloian.manassiev to talk to you tomorrow about his session refactor work and whether this will help wrap up these sorts of issues.

Comment by Judah Schvimer [ 28/Nov/18 ]

This may be solved by some combination of SERVER-36485, SERVER-36486, and SERVER-37348.

Generated at Thu Feb 08 04:48:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.