[SERVER-36457] mongos_rs_shard_failure_tolerance.js test should assert that the movePrimary command succeeds Created: 04/Aug/18  Updated: 29/Oct/23  Resolved: 10/Dec/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.6.13, 4.1.7, 4.0.10

Type: Improvement Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Kim Tao
Resolution: Fixed Votes: 0
Labels: neweng, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.0, v3.6
Sprint: Sharding 2018-12-17
Participants:
Linked BF Score: 0

 Description   

Due to the issue described in SERVER-31398 with the movePrimary command getting interrupted, the mongos_rs_shard_failure_tolerance.js test may fail in the sharding_continuous_config_stepdown.yml test suite. The test assumes that collUnsharded lives on shard #0 after it terminates the primaries of shards #1 and #2; however, it doesn't actually call assert.commandWorked() and instead just logs the server's response.

// Create the unsharded database
assert.writeOK(collUnsharded.insert({some: "doc"}));
assert.writeOK(collUnsharded.remove({}));
printjson(
    admin.runCommand({movePrimary: collUnsharded.getDB().toString(), to: st.shard0.shardName}));
 
// Create the sharded database
assert.commandWorked(admin.runCommand({enableSharding: collSharded.getDB().toString()}));
printjson(
    admin.runCommand({movePrimary: collSharded.getDB().toString(), to: st.shard0.shardName}));
assert.commandWorked(
    admin.runCommand({shardCollection: collSharded.toString(), key: {_id: 1}}));
assert.commandWorked(admin.runCommand({split: collSharded.toString(), middle: {_id: 0}}));
assert.commandWorked(admin.runCommand(
    {moveChunk: collSharded.toString(), find: {_id: 0}, to: st.shard1.shardName}));

This obscures the root cause of the failure, which simply manifests as mongos repeatedly trying to read from one of the other (downed) shards.
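The difference between logging and asserting can be sketched outside the shell. The block below is a minimal illustration, not the committed change: `commandWorkedSketch` is a hypothetical stand-in for the shell's assert.commandWorked(), and `fakeAdmin` is a hypothetical fake of the admin database handle that simulates an interrupted movePrimary. The database and shard names are placeholders.

```javascript
// Hypothetical stand-in for the mongo shell's assert.commandWorked():
// throws unless the response reports ok: 1. The real helper is provided
// by the shell and does more than this.
function commandWorkedSketch(res) {
    if (!res || res.ok !== 1) {
        throw new Error("command failed: " + JSON.stringify(res));
    }
    return res;
}

// Hypothetical fake of admin.runCommand(). It simulates movePrimary being
// interrupted; every other command succeeds. The error message is made up.
const fakeAdmin = {
    runCommand(cmd) {
        if ("movePrimary" in cmd) {
            return {ok: 0, errmsg: "simulated interruption"};
        }
        return {ok: 1};
    },
};

// Placeholder names, not taken from the test.
const cmd = {movePrimary: "exampleDb", to: "exampleShard0"};

// Logging pattern (what the test did): the failed response is printed
// and execution continues as if the primary shard had been moved.
console.log(JSON.stringify(fakeAdmin.runCommand(cmd)));

// Asserting pattern: the failure surfaces at the call site.
let failedAtCallSite = false;
try {
    commandWorkedSketch(fakeAdmin.runCommand(cmd));
} catch (e) {
    failedAtCallSite = true;
    console.log("caught: " + e.message);
}
```

In the test itself, the equivalent change would presumably be to wrap each movePrimary call in assert.commandWorked() rather than printjson(), so that an interrupted command fails the test immediately instead of surfacing later as reads against a downed shard.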



 Comments   
Comment by Githook User [ 15/Apr/19 ]

Author:

{'name': 'Kim Tao', 'email': 'kimberly.tao@10gen.com'}

Message: SERVER-36457: mongos_rs_shard_failure_tolerance.js should assert that movePrimary command succeeds

(cherry picked from commit 00520c2e0b89483e390ecb25cd3291ca8fa30c0f)
Branch: v3.6
https://github.com/mongodb/mongo/commit/954024b15a08e6e78b076ff91324fb26b8eed733

Comment by Luke Chen [ 11/Apr/19 ]

Fixing up fixversion as this ticket was not included as part of 4.0.9 release.

Comment by Githook User [ 09/Apr/19 ]

Author:

{'email': 'kimberly.tao@10gen.com', 'name': 'Kim Tao'}

Message: SERVER-36457: mongos_rs_shard_failure_tolerance.js should assert that movePrimary command succeeds

(cherry picked from commit 00520c2e0b89483e390ecb25cd3291ca8fa30c0f)
Branch: v4.0
https://github.com/mongodb/mongo/commit/789a26ba61c4559ca7f28b60f85c04745a4d2440

Comment by Githook User [ 10/Dec/18 ]

Author:

{'name': 'Kim Tao', 'email': 'kimberly.tao@10gen.com'}

Message: SERVER-36457: mongos_rs_shard_failure_tolerance.js should assert that movePrimary command succeeds
Branch: master
https://github.com/mongodb/mongo/commit/00520c2e0b89483e390ecb25cd3291ca8fa30c0f

Generated at Thu Feb 08 04:43:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.