[SERVER-5896] Server removed from replica set in sharded cluster is not fully removed
Created: 22/May/12  Updated: 15/Aug/12  Resolved: 27/Jul/12

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: 2.0.4
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Tad Marshall
Assignee: Randolph Tan
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Sharded Linux cluster:
3 config servers
3 shards, each initially with three nodes (later reconfigured to two nodes and an arbiter)
1 mongos


Operating System: Linux
Participants:

 Description   

A sharded cluster was reconfigured to replace a secondary on each of the three shards with an arbiter. This appeared to work based on command responses and acknowledgements in the logs, but the mongos continued to try to add the removed secondary back in, even after being restarted more than once. The mongos log showed a loop of erasing and then re-adding the removed node. The cluster did not become usable again until the removed nodes were added back to the cluster. Restarting the config servers did not help. We did not try restarting the individual nodes.



 Comments   
Comment by Randolph Tan [ 27/Jul/12 ]

The config document in local.system.replset has probably been modified manually. The behavior is 100% reproducible if you update the config document directly, and the only trigger it needs is a restart. This is by design. The right way to change the configuration is the replSetReconfig command, which includes sanity checks for errors.
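
As a rough illustration of the supported path described in this comment, the sketch below builds a new replica set config document that drops one secondary and adds an arbiter, so the result can be handed to replSetReconfig instead of being written into local.system.replset by hand. The helper name replace_member_with_arbiter and the host names are hypothetical, and the config layout (_id, version, members) is assumed to follow the usual replica set config document. The version bump reflects what the raw command has historically expected; the rs.reconfig() shell helper normally handles that itself.

```python
import copy


def replace_member_with_arbiter(config, remove_host, arbiter_host):
    """Return a new config with remove_host dropped and an arbiter added.

    Hypothetical helper: it only builds the document and never talks to a server.
    """
    new_config = copy.deepcopy(config)  # leave the caller's document untouched
    members = new_config["members"]

    kept = [m for m in members if m["host"] != remove_host]
    if len(kept) == len(members):
        raise ValueError("host not found in config: " + remove_host)

    # Member _id values must be unique, so take the next unused one.
    next_id = max(m["_id"] for m in members) + 1
    kept.append({"_id": next_id, "host": arbiter_host, "arbiterOnly": True})

    new_config["members"] = kept
    new_config["version"] = config["version"] + 1
    return new_config


# Example with made-up hosts. Applying the result is left as a comment because
# it needs a live primary, for example with PyMongo:
#     client.admin.command("replSetReconfig", new_config)
current = {
    "_id": "shard1",
    "version": 3,
    "members": [
        {"_id": 0, "host": "node-a:27017"},
        {"_id": 1, "host": "node-b:27017"},
        {"_id": 2, "host": "node-c:27017"},
    ],
}
new_config = replace_member_with_arbiter(current, "node-c:27017", "arbiter-1:27017")
```

The document is built as a copy so that the current config stays available for comparison if the reconfiguration is rejected by the server's sanity checks.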
