Details
Bug
Status: Closed
Major - P3
Resolution: Fixed
None
None
None
Fully Compatible
ALL
Repl 2022-04-04, Repl 2022-04-18, Repl 2022-05-02
44
Description
As part of downgrading the cluster, we stop the config server mongod. Stopping the node includes stepping it down before collection validation. However, an election can run concurrently with the stepdown. The killOp thread for the resulting state transition then kills the stepDown command with InterruptedDueToReplStateChange:
[js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.425+00:00 I ELECTION 21450 [ReplCoord-9] "Election succeeded, assuming primary role","attr":{"term":2}
[js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.425+00:00 I REPL 21358 [ReplCoord-9] "Replica set state transition","attr":{"newState":"PRIMARY","oldState":"SECONDARY"}
...
[js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.456+00:00 I COMMAND 21579 [conn94] "Attempting to step down in response to replSetStepDown command"
...
[js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.487+00:00 I REPL 21343 [RstlKillOpThread] "Starting to kill user operations"
[js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.490+00:00 I REPL 21344 [RstlKillOpThread] "Stopped killing user operations"
[js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.490+00:00 I REPL 21340 [RstlKillOpThread] "State transition ops metrics","attr":{"metrics":{"lastStateTransition":"stepUp","userOpsKilled":1,"userOpsRunning":4}}
One way to fix this is to set the cluster secondaries to votes: 0, since we don't expect to test election behavior in this test. An alternative is to tolerate InterruptedDueToReplStateChange at https://github.com/10gen/mongo/blob/1cc143da4077560d714d99471b8006c0dec5f66a/jstests/libs/override_methods/validate_collections_on_shutdown.js#L87 in validate_collections_on_shutdown.js.
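
To make the two options concrete, here is a minimal sketch in plain JavaScript. The helper names `withNonVotingSecondaries` and `stepDownToleratingInterruption` are hypothetical and are not part of the jstests libraries, and the sketch does not reproduce the code at the linked line. It assumes the step-down is issued through a callback that returns the command response document.

```javascript
// Hypothetical helpers for illustration only; not taken from the MongoDB jstests libraries.

// Numeric error code that MongoDB assigns to InterruptedDueToReplStateChange.
const INTERRUPTED_DUE_TO_REPL_STATE_CHANGE = 11602;

// Option 1: return a copy of a replica set config in which every member except
// the one at primaryIndex is non-voting. A member with votes: 0 must also have
// priority: 0, so it can neither vote nor be elected.
function withNonVotingSecondaries(config, primaryIndex = 0) {
    return {
        ...config,
        members: config.members.map((member, i) =>
            i === primaryIndex ? {...member} : {...member, votes: 0, priority: 0}),
    };
}

// Option 2: run a step-down and treat an interruption caused by a replication
// state change as non-fatal. runStepDown is assumed to return the command
// response document ({ok: 1} on success, {ok: 0, code: ...} on failure).
function stepDownToleratingInterruption(runStepDown) {
    const res = runStepDown();
    if (res.ok === 1) {
        return {steppedDown: true, interrupted: false};
    }
    if (res.code === INTERRUPTED_DUE_TO_REPL_STATE_CHANGE) {
        // The node changed state concurrently; the caller can proceed or retry.
        return {steppedDown: false, interrupted: true};
    }
    throw new Error("replSetStepDown failed: " + JSON.stringify(res));
}
```

In a shell-based test, the callback would presumably wrap something like `conn.adminCommand({replSetStepDown: ..., force: true})`. Which of the two options fits better depends on whether the test should keep voting secondaries for other reasons.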