[SERVER-36305] Add KillSessions stage to transactions FSM workloads Created: 26/Jul/18  Updated: 29/Oct/23  Resolved: 21/Aug/18

Status: Closed
Project: Core Server
Component/s: Replication, Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.1.3

Type: Task Priority: Major - P3
Reporter: Judah Schvimer Assignee: Janna Golden
Resolution: Fixed Votes: 0
Labels: prepare_testing
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Sharding 2018-08-13, Sharding 2018-08-27
Participants:

 Comments   
Comment by Githook User [ 21/Aug/18 ]

Author: jannaerin (golden.janna@gmail.com)

Message: SERVER-36305 Add KillSessions stage to transactions FSM workloads
Branch: master
https://github.com/mongodb/mongo/commit/49109314652b5861d3a18a7f679bb39a9bebbc0c

Comment by Max Hirschhorn [ 02/Aug/18 ]

Per an in-person discussion with judah.schvimer, we'd eventually like to have coverage around a prepared transaction being aborted (e.g. as a result of the session being killed). An intermediate step along the way is to take one of the existing transactions FSM workloads and create a new version of it that periodically kills an active session. The following describes a new multi_statement_transaction_kill_sessions.js FSM workload:

1. Extend the multi_statement_transaction_atomicity_isolation.js FSM workload to have a killSession() state function that first runs the refreshLogicalSessionCacheNow testing-only command, then runs the following aggregation pipeline to get an arbitrary active session:

db.getSiblingDB("config").system.sessions.aggregate([
    {$listSessions: {}},
    {$sample: {size: 1}},
]);

It then runs the killSessions command with that logical session id. Alternatively, we could just run the killAllSessions command, depending on how large of a hammer we want to use. The transition table should be updated so that ~1 thread is running the killSession() state function at a time, while the other worker threads run the update() and checkConsistency() state functions.

Note: This FSM workload may kill an active session on the server that's in use by a different FSM workload, which wouldn't be equipped to handle the server's error response from its operation being killed. The multi_statement_transaction_kill_sessions.js FSM workload therefore shouldn't be run in the concurrency_simultaneous*.yml test suites.
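A rough sketch of what the killSession() state function from step 1 might look like, for discussion only. It assumes the state function receives db and collName like other FSM state functions, and that $listSessions reports the logical session id under _id.id. makeTransitions() and its 0.9/0.1 split between update() and checkConsistency() are hypothetical; the idea is just that with N worker threads, a 1/N chance of entering killSession keeps roughly one thread there at a time.

```javascript
// Sketch only, not the committed implementation.
function killSession(db, collName) {
    // Testing-only command: flush in-memory session records to
    // config.system.sessions so $listSessions can see recent sessions.
    const refreshRes = db.adminCommand({refreshLogicalSessionCacheNow: 1});
    if (!refreshRes.ok) {
        throw new Error("refreshLogicalSessionCacheNow failed: " + JSON.stringify(refreshRes));
    }

    // Pick one arbitrary active session.
    const sessions = db.getSiblingDB("config").system.sessions.aggregate([
        {$listSessions: {}},
        {$sample: {size: 1}},
    ]).toArray();
    if (sessions.length === 0) {
        return null;  // Nothing to kill on this iteration.
    }

    // Assumption: the logical session id is stored under _id.id.
    const lsid = {id: sessions[0]._id.id};
    const killRes = db.runCommand({killSessions: [lsid]});
    if (!killRes.ok) {
        throw new Error("killSessions failed: " + JSON.stringify(killRes));
    }
    return lsid;
}

// Hypothetical helper: builds a transition table in which every state moves to
// killSession with probability 1 / threadCount.
function makeTransitions(threadCount) {
    const pKill = 1 / threadCount;
    const row = () => ({
        update: 0.9 * (1 - pKill),
        checkConsistency: 0.1 * (1 - pKill),
        killSession: pKill,
    });
    return {init: row(), update: row(), checkConsistency: row(), killSession: row()};
}
```

If we go with killAllSessions instead, the $listSessions step would drop out entirely and only the refresh and the kill command would remain.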

2. Change the withTxnAndAutoRetry() function defined in jstests/concurrency/fsm_workload_helpers/auto_retry_transaction.js to automatically retry the entire transaction (by calling func() again) on the server's error response for when the session is killed. We'll likely only want to enable this mode when running the multi_statement_transaction_kill_sessions.js FSM workload, so the withTxnAndAutoRetry() function should probably take an additional parameter for it. The multi_statement_transaction_atomicity_isolation.js FSM workload could then pass this.autoRetryOptions to the withTxnAndAutoRetry() function (which would be undefined normally), and the multi_statement_transaction_kill_sessions.js FSM workload would set $config.data.autoRetryOptions to enable the retry behavior.
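To make step 2 concrete, here is a minimal sketch of the retry mode. The function name withTxnAndAutoRetrySketch, the retryOnKilledSession option, and the kKilledSessionCodes set are all hypothetical. Treating Interrupted (11601) as the only "session was killed" error code is an assumption; the real set of codes would need to be confirmed against what the server actually returns.

```javascript
// Assumption: a killed session surfaces as ErrorCodes.Interrupted (11601).
const kKilledSessionCodes = new Set([11601]);

// Sketch only. Runs func() inside a transaction on `session` and, when
// retryOnKilledSession is set, reruns the whole transaction if it failed
// because the session was killed. Returns the number of attempts made.
function withTxnAndAutoRetrySketch(session, func, {retryOnKilledSession = false} = {}) {
    let attempts = 0;
    for (;;) {
        attempts++;
        try {
            session.startTransaction();
            func();
            session.commitTransaction();
            return attempts;
        } catch (e) {
            // Best-effort cleanup; the transaction may already be gone.
            try {
                session.abortTransaction();
            } catch (abortErr) {
                // Ignore: nothing left to abort.
            }
            if (retryOnKilledSession && kKilledSessionCodes.has(e.code)) {
                continue;  // Retry the entire transaction from the start.
            }
            throw e;
        }
    }
}
```

The base workload would call withTxnAndAutoRetrySketch(this.session, () => { ... }, this.autoRetryOptions); because the options parameter defaults to {}, leaving autoRetryOptions undefined keeps today's behavior. The kill-sessions workload would set $config.data.autoRetryOptions = {retryOnKilledSession: true}. The sketch retries without a bound, which a real version might want to cap.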
