[SERVER-41625] RollbackTest fixture runs data consistency checks twice when it stops Created: 10/Jun/19  Updated: 29/Oct/23  Resolved: 27/Jan/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.3.4

Type: Improvement Priority: Minor - P4
Reporter: William Schultz (Inactive) Assignee: Ryan Timmons
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-25640 Have ReplSetTest run checkDBHashes() ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2020-01-27, Repl 2020-02-10
Participants:

 Description   

The RollbackTest fixture runs replica set data consistency checks explicitly inside its stop() method. We subsequently call stopSet on the underlying ReplSetTest used by the RollbackTest. Since the completion of SERVER-25640, we will run data consistency checks by default inside stopSet. This causes us to run data consistency checks twice when stopping an instance of RollbackTest. These extra checks are redundant and can slow down test execution. We can likely remove the extra data consistency check in RollbackTest.stop() and rely on the checks done by ReplSetStep.stopSet().



 Comments   
Comment by Githook User [ 27/Jan/20 ]

Author:

{'email': 'ryan.timmons@mongodb.com', 'username': 'rtimmons', 'name': 'Ryan Timmons'}

Message: SERVER-41625 RollbackTest fixture runs data consistency checks twice when it stops
Branch: master
https://github.com/mongodb/mongo/commit/3edb43edb035799a599bde09acd4006715987e5a

Comment by William Schultz (Inactive) [ 14/Jan/20 ]

ryan.timmons In a standard usage of RollbackTest, where we go through all the normal state transitions, we will do consistency checks 3 times: (1) when we call transitionToSteadyStateOperations(), (2) when we call RollbackTest.stop, and (3) when we call ReplSetTest.stopSet from inside RollbackTest.stop. Ideally, we want to only have 1 set of consistency checks. I think that the consistency checks inside 'ReplSetTest.stopSet' are always redundant, since we directly call 'checkDataConsistency' immediately before. So, we can always skip the stopSet consistency checks. You can do this by setting jsTest.options().skipCheckDBHashes, but it might be simpler to just add another argument to the stopSet function that allows you to bypass consistency checks, so that we don't have to change those global settings for a single function call.

To combine checks (1) and (2), I think it would make sense to record whether we already checked data consistency inside 'transitionToSteadyStateOperations()'. We should always call RollbackTest.stop when ending a test, but, as you pointed out, we may not always call transitionToSteadyStateOperations(), and even if we do, we may skip consistency checks there. Once we reach RollbackTest.stop, we can see if consistency checks were already executed previously, inside transitionToSteadyStateOperations(). If they were, then we don't need to do any consistency checks. If they weren't, then we can do one set of consistency checks.

To summarize, we would always skip the ReplSetTest.stopSet consistency checks, and only do consistency checks inside RollbackTest.stop if we didn't already do them inside transitionToSteadyStateOperations(). Also note that the 'checkDataConsistency' function is slightly more thorough than the consistency checks inside ReplSetTest.stopSet, since we also explicitly check collection counts there.

Comment by William Schultz (Inactive) [ 11/Jun/19 ]

To amend the original description, I believe that we actually do 3 data consistency checks in tests that use the RollbackTest in the standard way. This is because there is another call to RollbackTest.checkDataConsistency() inside transitionToSteadyStateOperations, which is always called when going through the normal rollback test sequence.

Comment by William Schultz (Inactive) [ 11/Jun/19 ]

Note that the RollbackTest.checkDataConsistency appears to do one more check than ReplSetTest.stopSet, which is validateNodes. ReplSetTest only calls checkOplogs and checkReplicatedDataHashes. We may want to remove the extra calls to checkOplogs and checkReplicatedDataHashes while leaving the call to validateNodes present in the RollbackTest consistency checks.

Comment by William Schultz (Inactive) [ 10/Jun/19 ]

As one example of the slowdown incurred by these extra checks, running the rollback_rename_collection_on_sync_source.js test with the redundant checks in place took 58.75 seconds on my local machine, and 46.83 seconds after removing the redundant check.

Generated at Thu Feb 08 04:58:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.