[SERVER-51598] Add new test suites that test transaction expiration logic Created: 14/Oct/20  Updated: 29/Oct/23  Resolved: 12/Nov/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 4.0.18
Fix Version/s: 4.0.22

Type: Bug Priority: Major - P3
Reporter: Tess Avitabile (Inactive) Assignee: Jason Chan
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-46238 Race between commitTransaction and tr... Closed
is related to SERVER-41020 Tweak or fuzz storage engine tunable ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2020-11-02, Repl 2020-11-16
Participants:
Case:

 Description   

After the fix in SERVER-46238, we have continued to see crashes due to transaction expiration running concurrently with a transaction on 4.0. We should diagnose and fix this issue on 4.0. After this has been fixed, we should consider follow-up work to add randomized test coverage for low transactionLifetimeLimitSeconds on all branches.



 Comments   
Comment by Jason Chan [ 12/Nov/20 ]

We added two new test suites to v4.0. They are the concurrency_replication_abort_multi_stmt_txn and replica_sets_abort_multi_stmt_txn suites. They use the txn_override_passthrough logic to wrap supported operations in multi-statement transactions.

Both suites enable the newly added setTransactionLifetimeToRandomMillis and increaseFrequencyOfPeriodicThreadToExpireTransactions failpoints. Together these set the transaction lifetime limit (transactionLifetimeLimitSeconds) to a random value between 0 and 20 ms and make the periodic thread check for transaction expiration every 5 ms. All assertion errors are ignored in these suites, as the intent is for them to show up as a failure in Evergreen only if the system crashes, hangs, or hits a data consistency issue.
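As a rough illustration of the failpoint setup described in the comment above, the sketch below builds the configureFailPoint command documents that would turn the two failpoints on or off. It only constructs the documents and does not connect to a server. The failpoint names come from this ticket; the helper names (configure_fail_point, build_commands) are hypothetical, and whether either failpoint accepts a data payload is not stated here, so none is passed. The actual suite configuration in the commit may differ.

```python
from typing import Any, Dict, List

# Failpoint names as quoted in this ticket.
FAILPOINTS = [
    "setTransactionLifetimeToRandomMillis",
    "increaseFrequencyOfPeriodicThreadToExpireTransactions",
]


def configure_fail_point(name: str, mode: Any = "alwaysOn") -> Dict[str, Any]:
    """Build a configureFailPoint command document (hypothetical helper).

    The command name is inserted first because MongoDB reads the first
    key of a command document as the command to run.
    """
    return {"configureFailPoint": name, "mode": mode}


def build_commands(enable: bool = True) -> List[Dict[str, Any]]:
    """Return one command document per failpoint, enabling or disabling it."""
    mode = "alwaysOn" if enable else "off"
    return [configure_fail_point(name, mode) for name in FAILPOINTS]


if __name__ == "__main__":
    for cmd in build_commands():
        print(cmd)
```

Each document would be sent to the admin database of a test-only server, for example one started with enableTestCommands=1, since failpoints are not available otherwise.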

Comment by Githook User [ 12/Nov/20 ]

Author: Jason Chan <jason.chan@10gen.com> (jasonjhchan)

Message: SERVER-51598 Add new abort_multi_stmt_txn_test suites
Branch: v4.0
https://github.com/mongodb/mongo/commit/e399bf8689f592129c9655933bdb6a0e551a47b8

Comment by Tess Avitabile (Inactive) [ 15/Oct/20 ]

I decided to change this ticket to be about diagnosing and fixing the bug on 4.0, instead of adding test coverage for transactionLifetimeLimitSeconds. After the bug is fixed, we should consider adding more test coverage on all branches.

Comment by Louis Williams [ 15/Oct/20 ]

SERVER-41020 should introduce more coverage for a range of values, including 'transactionLifetimeLimitSeconds'. What I don't know, however, is whether we will be able to easily backport that test coverage back to 4.0.

Comment by Daniel Gottlieb (Inactive) [ 15/Oct/20 ]

louis.williams I see SERVER-41020 is in progress, and it has a comment that mentions fuzzing some non-storage-related parameters (including transaction lifetime limits). Is the patch you have in mind for that ticket going to cover this one as well? Or does it make more sense for whoever picks this ticket up to do independent work?

Generated at Thu Feb 08 05:25:55 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.