Details
Description
Due to the issue described in SERVER-31398, where the movePrimary command can get interrupted, the mongos_rs_shard_failure_tolerance.js test may fail in the sharding_continuous_config_stepdown.yml test suite. The test assumes that collUnsharded lives on shard #0 after it terminates the primaries of shards #1 and #2; however, it never calls assert.commandWorked() on the movePrimary response and instead just logs it:
// Create the unsharded database
assert.writeOK(collUnsharded.insert({some: "doc"}));
assert.writeOK(collUnsharded.remove({}));
printjson(
    admin.runCommand({movePrimary: collUnsharded.getDB().toString(), to: st.shard0.shardName}));

// Create the sharded database
assert.commandWorked(admin.runCommand({enableSharding: collSharded.getDB().toString()}));
printjson(
    admin.runCommand({movePrimary: collSharded.getDB().toString(), to: st.shard0.shardName}));
assert.commandWorked(
    admin.runCommand({shardCollection: collSharded.toString(), key: {_id: 1}}));
assert.commandWorked(admin.runCommand({split: collSharded.toString(), middle: {_id: 0}}));
assert.commandWorked(admin.runCommand(
    {moveChunk: collSharded.toString(), find: {_id: 0}, to: st.shard1.shardName}));
This makes the root cause of the failure much less obvious, as it manifests only as mongos repeatedly trying to read from one of the other (downed) shards.
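A minimal sketch of one way to surface this at setup time, assuming the test can simply fail fast when movePrimary is interrupted (the actual fix may instead need to retry the command, given that this suite continuously steps down the config server primary):

// Hypothetical hardening: assert on the movePrimary responses instead of
// only printing them, so an interrupted movePrimary fails the test during
// setup rather than later, as reads against a downed shard.
assert.commandWorked(
    admin.runCommand({movePrimary: collUnsharded.getDB().toString(), to: st.shard0.shardName}));
assert.commandWorked(
    admin.runCommand({movePrimary: collSharded.getDB().toString(), to: st.shard0.shardName}));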