[SERVER-45094] Add passthrough tests for safe reconfig Created: 12/Dec/19  Updated: 29/Oct/23  Resolved: 17/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.4.0-rc6, 4.7.0

Type: Task Priority: Major - P3
Reporter: Siyuan Zhou Assignee: Pavithra Vetriselvan
Resolution: Fixed Votes: 0
Labels: safe-reconfig-testing
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.4
Sprint: Repl 2020-02-10, Repl 2020-02-24, Repl 2020-03-09, Repl 2020-03-23, Repl 2020-04-06, Repl 2020-04-20
Participants:
Linked BF Score: 25

 Description   

The following passthrough test suites will be added:

  • replica_set_reconfig_jscore_passthrough
  • replica_set_reconfig_stepdown_jscore_passthrough
  • replica_set_reconfig_kill_nodes_jscore_passthrough

We will start a 5-node replica set in which 3 nodes have votes: 1. The test suites periodically remove or add a randomly chosen node by changing that node's votes, so that the config always has between 1 and 5 voting nodes. replica_set_reconfig_kill_nodes_jscore_passthrough also kills or restarts a random node to cover unclean shutdown and durability issues. The shutdown hook and reconfig hook will make sure a majority of nodes are alive at any time.
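
The vote-toggling step described above could be sketched roughly as follows. This is an illustrative, standalone Python model and not the actual resmoke hook: the function names (initial_config, next_config, majority_of_voters_alive), the set name "rs0", and the host names are all hypothetical. It assumes each reconfig flips exactly one member between voting and non-voting, that a member with votes: 0 also gets priority: 0, and that "majority of nodes" means a majority of the voting members. It does not model which node is currently primary.

```python
import copy
import random

MIN_VOTERS = 1  # the description allows configs with 1 to 5 voting nodes
MAX_VOTERS = 5


def initial_config():
    """Hypothetical 5-member config: members 0-2 vote, members 3-4 do not."""
    members = []
    for i in range(5):
        voting = i < 3
        members.append({
            "_id": i,
            "host": "localhost:%d" % (20000 + i),  # placeholder host names
            "votes": 1 if voting else 0,
            "priority": 1 if voting else 0,  # non-voting members need priority 0
        })
    return {"_id": "rs0", "version": 1, "members": members}


def next_config(config, rng=random):
    """Return a copy of config with one randomly chosen member's votes flipped."""
    voters = [m for m in config["members"] if m["votes"] == 1]
    non_voters = [m for m in config["members"] if m["votes"] == 0]

    candidates = []
    if len(voters) > MIN_VOTERS:
        candidates.extend(voters)      # "remove" a node: votes 1 -> 0
    if len(voters) < MAX_VOTERS:
        candidates.extend(non_voters)  # "add" a node: votes 0 -> 1
    chosen_id = rng.choice(candidates)["_id"]

    new_config = copy.deepcopy(config)
    new_config["version"] += 1
    for member in new_config["members"]:
        if member["_id"] == chosen_id:
            flipped = 0 if member["votes"] == 1 else 1
            member["votes"] = flipped
            member["priority"] = flipped
    return new_config


def majority_of_voters_alive(config, alive_ids):
    """True if strictly more than half of the voting members are alive."""
    voters = [m["_id"] for m in config["members"] if m["votes"] == 1]
    alive = sum(1 for member_id in voters if member_id in alive_ids)
    return alive > len(voters) // 2
```

In a real hook, the document returned by next_config would be passed to replSetReconfig on the primary, and a check like majority_of_voters_alive would guard both the reconfig and any kill or restart of a node.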

These suites should be run with readConcern: majority, writeConcern: majority, causal consistency, readPreference: primary, and retryable writes so that reconfig does not affect the operations.
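
For illustration only, the client settings listed above map onto standard MongoDB connection string options roughly as shown below. This is a sketch under assumptions: the real suites are presumably configured through the test runner rather than a URI, the function name passthrough_uri and the host names are hypothetical, and causal consistency is a per-session setting in drivers, so it is shown separately. Note also that a later commit on this ticket switched the reconfig passthrough to w:1 writes and removed causal consistency, so this reflects the original description and not the final suite.

```python
from urllib.parse import urlencode


def passthrough_uri(hosts, replica_set):
    """Build a connection string carrying the settings named in the description."""
    options = {
        "replicaSet": replica_set,
        "readConcernLevel": "majority",  # readConcern: majority
        "w": "majority",                 # writeConcern: majority
        "readPreference": "primary",
        "retryWrites": "true",           # retryable writes
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


# Causal consistency is requested when a session is started, not in the URI
# (for example, PyMongo's start_session(causal_consistency=True)).
SESSION_OPTIONS = {"causal_consistency": True}

uri = passthrough_uri(["localhost:20000", "localhost:20001"], "rs0")
```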



 Comments   
Comment by Githook User [ 14/May/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 add disabled replica set reconfig passthroughs

SERVER-45094 add retryable read logic to network_error_and_txn_override.js

(cherry picked from commit f59f63db6c37c0d4657b57d559c95d830b0e34c2)

SERVER-45094 add replica_sets_reconfig_jscore_passthrough suite

(cherry picked from commit 4d91fac171cbe3f2af53d9258965399e648a1947)

SERVER-45094 use w:1 writes and remove causal consistency in reconfig passthrough

(cherry picked from commit a43cb23defc6182d08a7814e4731ef98f2d30b6a)

SERVER-45094 add replica_sets_reconfig_jscore_stepdown_passthrough

(cherry picked from commit 81e0ad27c280c02a49beb65ff4473d5dce62b089)

SERVER-45094 add replica_sets_reconfig_kill_primary_jscore_passthrough

(cherry picked from commit 2debab7987b24bf902f9a128654ce928441c29a2)

SERVER-47678 stepdown and kill primary reconfig passthroughs should ignore ReplicaSetMonitorErrors

(cherry picked from commit 91672e58f1169c7edd684b911f20f62b8a71f8d1)

SERVER-47544 always increase election timeout to 24 hours in passthrough suites

(cherry picked from commit 81d53a715f49827a9f2538d4572f9b01f2b12887)
Branch: v4.4
https://github.com/mongodb/mongo/commit/f4528563033d933ca920b3e4b2a5e3344e198a5c

Comment by Pavithra Vetriselvan [ 17/Apr/20 ]

siyuan.zhou For bookkeeping, we will need to backport this ticket along with SERVER-47622, SERVER-47643, and SERVER-47678 to keep 4.4 as stable as possible.

Comment by Githook User [ 17/Apr/20 ]

Author:

{'name': 'Siyuan Zhou', 'email': 'siyuan.zhou@mongodb.com', 'username': 'visualzhou'}

Message: SERVER-47142 Revert "SERVER-45094 check if node is primary before doing reconfig noop write"

This reverts commit 0e7b476357a12802f1be1ca5c8a4b1a919000ef8.
Branch: master
https://github.com/mongodb/mongo/commit/0f8cb172af4f5b5df89e96097398319c811657b6

Comment by Githook User [ 14/Apr/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 add replica_sets_reconfig_kill_primary_jscore_passthrough
Branch: master
https://github.com/mongodb/mongo/commit/2debab7987b24bf902f9a128654ce928441c29a2

Comment by Githook User [ 02/Apr/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 add replica_sets_reconfig_jscore_stepdown_passthrough
Branch: master
https://github.com/mongodb/mongo/commit/81e0ad27c280c02a49beb65ff4473d5dce62b089

Comment by Githook User [ 02/Apr/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 check if node is primary before doing reconfig noop write
Branch: master
https://github.com/mongodb/mongo/commit/0e7b476357a12802f1be1ca5c8a4b1a919000ef8

Comment by Githook User [ 02/Apr/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 replSetReconfig.js should wait for primary before running reconfig
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/e4c962e6c9cdf11e7fb53e728b02a76cb8160ecc

Comment by Githook User [ 01/Apr/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: Revert "SERVER-45094 check if node is primary before doing reconfig noop write"

This reverts commit 49836a791fbab2c8f3726450cda1d3c708eff90a.
Branch: master
https://github.com/mongodb/mongo/commit/120555f4df938ff61110a63eeccadf9c3068d24b

Comment by Githook User [ 01/Apr/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 check if node is primary before doing reconfig noop write
Branch: master
https://github.com/mongodb/mongo/commit/49836a791fbab2c8f3726450cda1d3c708eff90a

Comment by Githook User [ 30/Mar/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'email': 'pvselvan@umich.edu', 'username': 'pvselvan'}

Message: SERVER-45094 use w:1 writes and remove causal consistency in reconfig passthrough
Branch: master
https://github.com/mongodb/mongo/commit/a43cb23defc6182d08a7814e4731ef98f2d30b6a

Comment by Githook User [ 17/Mar/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'username': 'pvselvan', 'email': 'pvselvan@umich.edu'}

Message: SERVER-45094 add replica_sets_reconfig_jscore_passthrough suite
Branch: master
https://github.com/mongodb/mongo/commit/4d91fac171cbe3f2af53d9258965399e648a1947

Comment by Githook User [ 16/Mar/20 ]

Author:

{'name': 'Pavi Vetriselvan', 'username': 'pvselvan', 'email': 'pvselvan@umich.edu'}

Message: SERVER-45094 add retryable read logic to network_error_and_txn_override.js
Branch: master
https://github.com/mongodb/mongo/commit/f59f63db6c37c0d4657b57d559c95d830b0e34c2

Comment by Siyuan Zhou [ 30/Jan/20 ]

Sounds good to me! Nice investigation.

Comment by Pavithra Vetriselvan [ 30/Jan/20 ]

Talked to robert.guo and jason.chan about this. I think the hardest part will be adding the first suite, replica_set_reconfig_jscore_passthrough, since that's where we'll want to figure out how the suite decides to reconfig a set of nodes. Stepdown and shutdown hooks already exist and can be adapted for the next two suites.

From my understanding, we would like to start this suite with 5 nodes (all with votes: 1) and then use a background hook to incrementally change the votes of these nodes to 0 through reconfigs. We should only be trying to run reconfigs on sets of voting nodes. siyuan.zhou william.schultz, does that sound correct to you?
