SERVER-65184

Avoid concurrent election and stepdown in downgrade_default_write_concern_majority.js


Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • None
    • 5.0.9
    • None
    • None
    • Fully Compatible
    • ALL
    • Repl 2022-04-04, Repl 2022-04-18, Repl 2022-05-02
    • 44

    Description

      As part of downgrading the cluster, we stop the config server mongod. Part of the shutdown process is stepping the node down before collection validation. However, an election completes concurrently with the stepdown: the node wins the election and transitions to PRIMARY just before the replSetStepDown command arrives. The killOp thread for the stepUp transition then kills the stepDown with InterruptedDueToReplStateChange:

      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.425+00:00 I ELECTION 21450 [ReplCoord-9] "Election succeeded, assuming primary role","attr":{"term":2}
      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.425+00:00 I REPL 21358 [ReplCoord-9] "Replica set state transition","attr":{"newState":"PRIMARY","oldState":"SECONDARY"}
      ...
      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.456+00:00 I COMMAND 21579 [conn94] "Attempting to step down in response to replSetStepDown command"
      ...
      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.487+00:00 I REPL 21343 [RstlKillOpThread] "Starting to kill user operations"
      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.490+00:00 I REPL 21344 [RstlKillOpThread] "Stopped killing user operations"
      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.490+00:00 I REPL 21340 [RstlKillOpThread] "State transition ops metrics","attr":{"metrics":{"lastStateTransition":"stepUp","userOpsKilled":1,"userOpsRunning":4}}
      

      One way to fix this is to set the cluster secondaries to votes: 0, since we don't expect to test election behavior in this test. An alternative is to add InterruptedDueToReplStateChange to the errors handled at https://github.com/10gen/mongo/blob/1cc143da4077560d714d99471b8006c0dec5f66a/jstests/libs/override_methods/validate_collections_on_shutdown.js#L87 in validate_collections_on_shutdown.js.
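
      The two options above can be sketched in plain JavaScript. This is an illustration only, not the change that was committed for this ticket: the helper names withNonVotingSecondaries and stepDownToleratingInterruption are hypothetical and do not exist in the jstest libraries, and the stepdown itself is passed in as a callback rather than issued against a real connection. The sketch assumes the error surfaces as an exception with a numeric code field, and uses 11602, the server's code for InterruptedDueToReplStateChange.

```javascript
// Hypothetical helpers for illustration; not part of the MongoDB jstest libraries.

// Numeric value of ErrorCodes.InterruptedDueToReplStateChange in the server.
const INTERRUPTED_DUE_TO_REPL_STATE_CHANGE = 11602;

// Option 1: return a copy of a replica set config in which every member except
// the one at primaryIndex can neither vote nor be elected. MongoDB requires
// priority 0 for members with votes 0, so both fields are set together.
function withNonVotingSecondaries(config, primaryIndex = 0) {
    return {
        ...config,
        members: config.members.map((member, i) =>
            i === primaryIndex ? {...member} : {...member, votes: 0, priority: 0}),
    };
}

// Option 2: run a stepdown callback and tolerate only the error raised when a
// concurrent replica set state change interrupts it. Any other error is rethrown.
// Returns true if the stepdown ran to completion, false if it was interrupted.
function stepDownToleratingInterruption(stepDownFn) {
    try {
        stepDownFn();
        return true;
    } catch (e) {
        if (e && e.code === INTERRUPTED_DUE_TO_REPL_STATE_CHANGE) {
            return false;
        }
        throw e;
    }
}
```

      With the first option no secondary can call an election, so the race cannot occur. With the second the race still happens but no longer fails the shutdown hook.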

          People

            jason.chan@mongodb.com Jason Chan
            jason.chan@mongodb.com Jason Chan