  1. Core Server
  2. SERVER-35529

Rollback fuzzer suites do not need to shut down nodes in states where rollbacks do not occur

    • Server Tooling & Methods
    • 16

      The rollback_fuzzer_unclean_shutdowns and rollback_fuzzer_clean_shutdowns suites trigger rollbacks via the RollbackTest fixture, but inject clean and unclean node restarts at random points within the test. The RollbackTest fixture models a test execution as a state machine. In each state, we expect each node and the overall replica set topology to be in a particular condition, which should (for the most part) not change until we execute the next state transition. RollbackTest has five states:

      • kStopped - Test is no longer running.
      • kRollbackOps - Old primary is isolated from a secondary. Writes done to it will be rolled back.
      • kSyncSourceOpsBeforeRollback - New primary has been elected and can take writes.
      • kSyncSourceOpsDuringRollback - Rollback on old primary should be in progress.
      • kSteadyStateOps - Rollback should have completed and replica set should now be in steady state.
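
      The five states above can be sketched as a small state machine. This is an illustrative Python model, not the fixture's actual JavaScript code: the transition table, the initial state, and the names State, ALLOWED_TRANSITIONS and StateMachine are assumptions inferred from the order in which the states are described here.

```python
from enum import Enum


class State(Enum):
    STOPPED = "kStopped"
    ROLLBACK_OPS = "kRollbackOps"
    SYNC_SOURCE_OPS_BEFORE_ROLLBACK = "kSyncSourceOpsBeforeRollback"
    SYNC_SOURCE_OPS_DURING_ROLLBACK = "kSyncSourceOpsDuringRollback"
    STEADY_STATE_OPS = "kSteadyStateOps"


# Assumed transition table: one rollback cycle walks the states in the
# order listed in the ticket, and the test may only stop from steady state.
ALLOWED_TRANSITIONS = {
    State.STEADY_STATE_OPS: {State.ROLLBACK_OPS, State.STOPPED},
    State.ROLLBACK_OPS: {State.SYNC_SOURCE_OPS_BEFORE_ROLLBACK},
    State.SYNC_SOURCE_OPS_BEFORE_ROLLBACK: {State.SYNC_SOURCE_OPS_DURING_ROLLBACK},
    State.SYNC_SOURCE_OPS_DURING_ROLLBACK: {State.STEADY_STATE_OPS},
    State.STOPPED: set(),
}


class StateMachine:
    """Tracks the current state and rejects transitions not in the table."""

    def __init__(self):
        # Assumption: a fresh test starts in steady state.
        self.state = State.STEADY_STATE_OPS

    def transition(self, next_state):
        if next_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {next_state.value}"
            )
        self.state = next_state
```

      Under these assumptions, a full cycle is kSteadyStateOps, kRollbackOps, kSyncSourceOpsBeforeRollback, kSyncSourceOpsDuringRollback, and back to kSteadyStateOps; any other jump raises an error.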

      The goal of running the RollbackFuzzer with node shutdowns was mainly to exercise (1) shutting down a node while it is in the process of rollback or recovery and (2) shutting down a node that is currently being used as a sync source for a node undergoing rollback. These goals can still be achieved without injecting node restarts into states kRollbackOps or kSyncSourceOpsBeforeRollback. The only state we really need to inject restarts into is kSyncSourceOpsDuringRollback, since this is the state where rollback may be occurring.

      Removing unnecessary restarts will, ideally, reduce test times, and make debugging easier, since the logs will no longer include the many extra state transitions introduced by replica set node restarts.

      An easy way to implement this would be to make the restartNode function a no-op when RollbackTest is in certain states. Alternatively, we could update the fuzzer to only add restart commands at certain points in the test.
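
      Both options can be sketched in a few lines. The Python below is a hypothetical model rather than the real JavaScript fixture or fuzzer: RollbackTestSketch, RESTART_STATES, drop_unneeded_restarts and the dictionary shape of the commands are all invented for illustration, and only the state names and restartNode come from this ticket.

```python
# States in which a restart is still worth injecting (per this ticket,
# only the state where rollback may be in progress).
RESTART_STATES = frozenset({"kSyncSourceOpsDuringRollback"})


class RollbackTestSketch:
    """Option 1: the fixture silently skips restarts in other states."""

    def __init__(self, state="kSteadyStateOps"):
        self.state = state
        self.restarts = []  # records (state, node_id) for restarts performed

    def restart_node(self, node_id):
        if self.state not in RESTART_STATES:
            return False  # skipped: nothing is restarted in this state
        # A real fixture would stop and restart the node here.
        self.restarts.append((self.state, node_id))
        return True


def drop_unneeded_restarts(commands):
    """Option 2: the fuzzer drops restart commands outside RESTART_STATES.

    Each command is a hypothetical dict: {"type": "transition", "to": <state>}
    or {"type": "restartNode", "node": <id>}. The test is assumed to start
    in kSteadyStateOps.
    """
    state = "kSteadyStateOps"
    kept = []
    for cmd in commands:
        if cmd["type"] == "transition":
            state = cmd["to"]
        elif cmd["type"] == "restartNode" and state not in RESTART_STATES:
            continue  # this restart would not exercise rollback
        kept.append(cmd)
    return kept
```

      The first option leaves generated tests unchanged and filters at run time, while the second keeps the skipped restarts out of the generated test files entirely.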

            Assignee:
            backlog-server-stm Backlog - Server Tooling and Methods (STM) (Inactive)
            Reporter:
            william.schultz@mongodb.com Will Schultz
            Votes:
            0
            Watchers:
            6
