Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: Replication, Sharding, Testing Infrastructure
Cluster Scalability
We have many stepdown, kill, and terminate suites that ensure there is at most one primary at any given time, but we have no passthrough suites that allow more than one primary at a time. The goal of this ticket is to fill that gap in our test coverage. A targeted test for split-brain scenarios will be added in SERVER-38133.
We will develop the following new suites. They will all work similarly, but across our different types of tests (concurrency, fuzzer, jscore, sharding/). Rather than stepping down nodes, which exercises interruptibility well, we will step up nodes: keep the election timeout at 24 hours, step up a secondary, and let the original primary step itself down a few seconds later once it sees the higher term. We may want to adjust the heartbeat interval to tune how long the cluster has two primaries. A rough sketch of this setup follows.
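Below is a minimal sketch of the step-up mechanism, assuming a plain ReplSetTest fixture; the node count, timeout value, and helper usage are illustrative rather than the final suite hook.

    // Sketch only: a replica set with a 24-hour election timeout so no node
    // calls an election on its own.
    const rst = new ReplSetTest({
        nodes: 3,
        settings: {electionTimeoutMillis: 24 * 60 * 60 * 1000},
    });
    rst.startSet();
    rst.initiate();

    // Step up a secondary instead of stepping down the primary. The old
    // primary steps itself down a few seconds later, when a heartbeat shows
    // it the higher term, so the set briefly reports two primaries.
    const secondary = rst.getSecondary();
    assert.commandWorked(secondary.adminCommand({replSetStepUp: 1}));

    rst.stopSet();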
Additionally, we will use mongobridge and the failCommand failpoint to close connections and drop arbitrary network messages with a low probability. Mongobridge is the more obvious way of doing this, but it is not currently hooked up to the resmoke Python test fixtures. To fully test this, the failCommand failpoint would need to be extended so that it can drop arbitrary network messages entirely, or close connections, either before or after a command runs. A mongobridge sketch follows.
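For the mongobridge side, something along these lines should work when the fixture is started with useBridge; the discardMessagesFrom helper and the 1% loss probability are assumptions about the shell's mongobridge wrapper, not a final design.

    // Sketch only: nodes talk to each other through mongobridge, so the test
    // can drop a small fraction of the messages flowing between them.
    const rst = new ReplSetTest({nodes: 3, useBridge: true});
    rst.startSet();
    rst.initiate();

    // Drop roughly 1% of the messages node 0 receives from node 1
    // (placeholder probability).
    rst.nodes[0].discardMessagesFrom(rst.nodes[1], 0.01);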
This will only exist in sharded clusters because mongos does not use an electionId to prevent talking to older primaries. We do not want to drop messages destined for the client outright, only messages between nodes and to mongos; for the client we can instead close connections or return network errors, as in the sketch below.
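A hedged sketch of the client-facing behavior using the existing failCommand failpoint; the command list and probability are placeholders.

    // Sketch only: close the connection on ~1% of client-issued finds and
    // inserts. failInternalCommands stays false so traffic between cluster
    // members is unaffected.
    assert.commandWorked(db.adminCommand({
        configureFailPoint: "failCommand",
        mode: {activationProbability: 0.01},
        data: {
            failCommands: ["find", "insert"],
            closeConnection: true,
            failInternalCommands: false,
        },
    }));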
We should be able to use our retryable writes infrastructure to make this all happen, but we will need to increase the number of retries so that the probability of never retrying against the most recent primary (the one that can actually commit a majority write) is exceedingly low. This will make the suite run slower, and when the suite does fail, the retries must be exhausted before the test actually fails, so the retry count cannot be made arbitrarily high. The trade-off is sketched below.
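A simplified sketch of that trade-off; insertWithRetries, its retryability check, and maxRetries are illustrative stand-ins for whatever the suite's real retry override would do.

    // Sketch only: retry a majority write up to maxRetries times. A larger
    // budget makes it very unlikely we never reach the newest primary, but a
    // genuinely failing test must exhaust every retry before it can fail, so
    // the budget cannot be made arbitrarily high.
    function isRetryableNetworkError(e) {
        // Simplified retryability check; the real override recognizes more
        // error categories than just NetworkError.
        return (e.code !== undefined && ErrorCodes.isNetworkError(e.code)) ||
            /network error|socket exception/i.test(e.message || "");
    }

    function insertWithRetries(coll, doc, maxRetries) {
        for (let attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                assert.commandWorked(coll.getDB().runCommand({
                    insert: coll.getName(),
                    documents: [doc],
                    writeConcern: {w: "majority"},
                }));
                return;
            } catch (e) {
                if (!isRetryableNetworkError(e) || attempt === maxRetries) {
                    throw e;
                }
            }
        }
    }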
Ideally this would run with sharded collections and the balancer on, to add coverage of inter-node commands in sharded clusters (such as moveChunk) and to make sure they use w:majority and retry correctly. In the future, when sharding uses transactions internally, this will be incredibly valuable testing. A rough setup is sketched below.
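A rough sketch of that setup; the shard and node counts, database name, and hashed shard key are arbitrary examples.

    // Sketch only: a sharded cluster with replica set shards, a hashed-sharded
    // collection, and the balancer running, so moveChunk and other inter-node
    // commands execute concurrently with the step-up and network-fault hooks.
    const st = new ShardingTest({shards: 2, rs: {nodes: 3}});
    assert.commandWorked(st.s.adminCommand({enableSharding: "test"}));
    assert.commandWorked(
        st.s.adminCommand({shardCollection: "test.coll", key: {_id: "hashed"}}));
    st.startBalancer();

    // ... run the workload under the step-up and fault-injection hooks ...

    st.stopBalancer();
    st.stop();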
- related to:
SERVER-40586 step up instead of stepping down in stepdown suites (Closed)
SERVER-34943 failCommand failpoint should ignore commands from replica set members (Closed)