[SERVER-35452] Build testing infrastructure to default to majority writeConcern either for all transaction ops or drops done in the same JS file Created: 06/Jun/18  Updated: 27/Oct/23  Resolved: 28/Jan/19

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Xiangyu Yao (Inactive)
Resolution: Gone away Votes: 0
Labels: nyc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-35919 Ensure all tests that "use transactio... Closed
is related to SERVER-38670 enable KV drop-pending ident support Closed
Sprint: Storage NYC 2019-02-11
Participants:
Linked BF Score: 5

 Description   

Problem: two-phase drop takes a strong lock asynchronously and can cause transaction operations, which have 0-second lock request times, to fail randomly. Desired behavior: add majority writeConcern to such drop cmds to ensure the second phase finishes before testing proceeds; or the transaction op subsequent to the drop cmd needs majority write concern to achieve the same effect.

There needs to be some generic testing solution to two-phase drop interfering with transaction operations. It seems unlikely that every test writer will know to add majority writeConcern to drop cmds done in the proximity of transaction operations.



 Comments   
Comment by Dianna Hohensee (Inactive) [ 28/Jan/19 ]

I think this can be resolved when the replication two-phase drop logic is removed, eliminating the asynchronous X lock acquisition in v4.2 that can deadlock with transactions.

This ticket was intended to prevent existing and new tests from running into the deadlock that could occur in tests with concurrent transactions and dropCollection commands.

Looking at the linked BF, the associated failures haven't reoccurred since June 2018, so maybe the existing and new tests have been stabilized. Actually, looks like a later filed ticket, SERVER-35919, did this ticket's work. Closing.

Comment by Benety Goh [ 28/Jan/19 ]

Starting in 4.1.7 / SERVER-38670, two phase collection drops no longer take strong locks. Is the work described in this ticket still relevant?

Comment by Max Hirschhorn [ 02/Jul/18 ]

It isn't clear to me though how the testing infrastructure could know when the file is necessary as it depends on the test or test suite indicating that transactions are going to be used.

I had a new thought of how we could get the override file approach to work after judah.schvimer stopped by in the #server-tig Slack channel asking about SERVER-34349 and ways to generalize it. resmoke.py knows whether a JavaScript test is tagged with "uses_transactions"; however, the JavaScript test itself does not because the information is stored as a comment rather than code. If we made resmoke.py inject the tags of a JavaScript test as a new TestData parameter, then we could have override files make programmatic decisions based on those tags. We wouldn't need to put any logic in the server to use "majority" when test commands are enabled or anything like that, but could instead have the override file do something along the lines of

if (jsTest.options().testTags.includes("uses_transactions")) {
    // We create a copy of 'commandObj' to avoid mutating the parameter the caller specified.
    commandObj = Object.assign({}, commandObj);
    commandObj.writeConcern = Object.assign({}, commandObj.writeConcern);
    commandObj.writeConcern.w = "majority";
}

Comment by Dianna Hohensee (Inactive) [ 11/Jun/18 ]

Another way to fix this might be to just default drop to use majority writeConcern, unless it's otherwise specified. This could probably be done easily via a temporary check of

if (enableTestCommands)

{ use majority write concern }

in the drop command. I think the cluster drop cmd might already use majority writeConcern in self-defense.

Comment by Max Hirschhorn [ 06/Jun/18 ]

The set_read_and_write_concerns.js override file is some prior art that may be a useful reference for this ticket. It isn't clear to me though how the testing infrastructure could know when the file is necessary as it depends on the test or test suite indicating that transactions are going to be used.

Generated at Thu Feb 08 04:39:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.