[SERVER-82457] Test cases in txn_commit_optimizations_for_read_only_shards.js are not isolated from each other Created: 26/Oct/23  Updated: 22/Jan/24

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Randolph Tan Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: cs-subteam1, sharding-nyc, sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-40176 Cursor seekExact should not use WT_CU... Closed
Assigned Teams:
Cluster Scalability
Participants:
Linked BF Score: 5
Story Points: 4

 Description   

Each test case touches different documents, which in theory should not conflict with each other. However, due to SERVER-40176, new test cases can hit prepare conflicts if transactions from older test cases are still alive. This jstest also sets coordinateCommitReturnImmediatelyAfterPersistingDecision to true, which allows the commit to return early; combined with stopping replication in certain test cases, this can cause the older transaction to stay alive longer than expected. The combination of SERVER-40176 and prepared transactions living longer than expected can cause spurious failures in the test.



 Comments   
Comment by Randolph Tan [ 26/Oct/23 ]

Possible directions (assuming we'll never fix SERVER-40176):
1. Turn off the early return optimization. I think this is not compatible with all test cases; I tried it and it appears to cause some of them to hang.
2. Wait for the transaction to be completely cleaned up after calling cleanup.
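
A rough sketch of what option 2 could look like, written as plain JavaScript so it can be tried outside the shell. The helper name waitForTxnCleanup, the isCleanedUp callback, and the maxAttempts and sleepFn options are all hypothetical and are not part of the existing test; how "cleaned up" is actually detected for a given test case is left to the callback.

```javascript
// Hypothetical helper (not part of the existing jstest): poll until the
// previous test case's transaction is gone before the next case starts.
// `isCleanedUp` is an assumed callback that returns true once no prepared
// transaction from the previous case remains on any shard.
function waitForTxnCleanup(isCleanedUp, {maxAttempts = 100, sleepFn = () => {}} = {}) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (isCleanedUp()) {
            // Report how many checks were needed; useful when tuning timeouts.
            return attempt;
        }
        // Pause between checks; the caller supplies the actual sleep.
        sleepFn();
    }
    throw new Error(`transaction still alive after ${maxAttempts} checks`);
}
```

In the jstest itself the shell's existing assert.soon could probably play the same role, with the callback inspecting each shard for leftover transaction state. The extra wait would add to the runtime of every test case.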

I think doing either of these can potentially cause the test, which already takes a long time to complete (about 11 mins in the plain sharding suite), to take even longer. Perhaps we should split each failure mode into its own file so they can be parallelized (and also give faster turnaround for local testing).

Generated at Thu Feb 08 06:49:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.