[SERVER-37103] Add a hook to check for open transactions after every test Created: 12/Sep/18  Updated: 06/Dec/22  Resolved: 20/Sep/19

Status: Closed
Project: Core Server
Component/s: Replication, Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Backlog - Replication Team
Resolution: Won't Fix
Votes: 0
Labels: ShardedTxn:Testing, prepare_optional, prepare_testing
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-38059 Transactions write conflicts tests sh... Closed
related to SERVER-38060 Don't run after test hooks in resmoke... Closed
Assigned Teams:
Replication
Sprint: Repl 2019-07-01, Repl 2019-07-15
Participants:

 Description   

Passthrough tests run validation hooks after every test. Before those hooks run, these tests should run a hook that looks for open transactions and, if it finds any, marks the test as failed and then aborts the transactions.

Non-passthrough tests run validation when the cluster is stopped. These tests should run the same hook.

Tests are not allowed to leave transactions open; doing so is considered a bug, but it should not cause the test to hang.
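
The check, fail, and abort sequence described above can be sketched in Python, the language of resmoke.py. This is only a minimal sketch under stated assumptions, not the hook this ticket proposed: it assumes that currentOp-style documents mark an open transaction with a `transaction` subdocument and identify its session with `lsid`, and that the `killSessions` command is an acceptable way to abort the transactions. The names `OpenTransactionsError`, `find_open_transactions`, `fail_and_abort_open_transactions`, and the `run_command` callable are hypothetical.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


class OpenTransactionsError(Exception):
    """Hypothetical error used to mark the test as failed."""


def find_open_transactions(current_ops: List[Doc]) -> List[Doc]:
    # Assumption: an operation or idle session that belongs to an open
    # transaction carries a "transaction" subdocument.
    return [op for op in current_ops if "transaction" in op]


def fail_and_abort_open_transactions(
    current_ops: List[Doc], run_command: Callable[[Doc], Any]
) -> None:
    open_txns = find_open_transactions(current_ops)
    if not open_txns:
        return

    # Issue the abort before raising, so the cleanup still happens even
    # though the test is reported as failed.
    sessions = [{"id": op["lsid"]["id"]} for op in open_txns if "lsid" in op]
    if sessions:
        run_command({"killSessions": sessions})

    raise OpenTransactionsError(
        f"test left {len(open_txns)} transaction(s) open"
    )
```

Here `run_command` stands in for whatever the hook would use to send a command to the node under test, which keeps the sketch independent of any particular driver.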



 Comments   
Comment by Judah Schvimer [ 17/May/19 ]

We want to do this so that if validate stops blocking behind transactions we still get this coverage. We like that the test currently hangs, because that gives us hang analyzer output, though that's not required.

Comment by Gregory McKeon (Inactive) [ 17/May/19 ]

After running validate, we should check for open transactions and fail the test if there are any.

Comment by Max Hirschhorn [ 27/Nov/18 ]

I thought of this ticket while discussing SERVER-33589 with samy.lanka. Aborting transactions in order to be able to run the data consistency checks is certainly a noble cause. If we're interested in making these tests fail rather than hang (and there isn't a bug where clean shutdown hangs in the presence of prepared transactions), then it might be reasonable to use currentOp or serverStatus to check if there are prepared transactions and raise a StopExecution exception in the resmoke.py hook. We then wouldn't run other resmoke.py hooks like ValidateCollections or subsequent tests (via any resmoke.py job), and would instead shut the cluster down by sending all of the nodes a SIGTERM.

This wouldn't change the loss of test coverage due to the test suite failing early, but would make the failure symptom more obvious without a lot of extra work.
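
The serverStatus variant of the check suggested in the comment above could look roughly like the following Python sketch. It rests on assumptions: that the serverStatus reply reports the number of prepared transactions under `transactions.currentPrepared`, and that raising an exception from the hook is what stops the suite. The `StopExecution` class here is a local stand-in for the resmoke.py exception of the same name, and `count_prepared_transactions` and `check_no_prepared_transactions` are hypothetical helpers.

```python
from typing import Any, Dict


class StopExecution(Exception):
    """Local stand-in for the resmoke.py exception named in the comment above."""


def count_prepared_transactions(server_status: Dict[str, Any]) -> int:
    # Assumption: prepared transactions are reported under
    # transactions.currentPrepared; a missing field is treated as zero.
    return int(server_status.get("transactions", {}).get("currentPrepared", 0))


def check_no_prepared_transactions(
    node_name: str, server_status: Dict[str, Any]
) -> None:
    prepared = count_prepared_transactions(server_status)
    if prepared > 0:
        # Raising here is meant to skip the remaining hooks and tests.
        raise StopExecution(
            f"{node_name} still has {prepared} prepared transaction(s) after the test"
        )
```

The hook would call `check_no_prepared_transactions` once per node with that node's serverStatus reply; how the reply is fetched is left out of the sketch.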

Comment by Max Hirschhorn [ 18/Sep/18 ]

We should use the discover_topology.js library to have this run on the CSRS and all replica set shards within a sharded cluster. That way there isn't more work to be done when we want to turn this hook on for sharded clusters.
