[SERVER-5896] Server removed from replica set in sharded cluster is not fully removed Created: 22/May/12 Updated: 15/Aug/12 Resolved: 27/Jul/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Sharding |
| Affects Version/s: | 2.0.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Tad Marshall | Assignee: | Randolph Tan |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Sharded Linux cluster: |
| Issue Links: | |
| Operating System: | Linux |
| Participants: | |
| Description |
|
A sharded cluster was reconfigured to replace one secondary on each of its three shards with an arbiter. The reconfiguration appeared to succeed based on command responses and acknowledgements in the logs, but the mongos kept trying to add the removed secondary back, even after being restarted more than once. The mongos log showed a loop of erasing and then re-adding the removed node. The cluster did not become usable again until the removed nodes were added back. Restarting the config servers did not help; we did not try restarting the individual nodes. |
| Comments |
| Comment by Randolph Tan [ 27/Jul/12 ] |
|
The local.system.replset collection was probably modified manually. This behavior is 100% reproducible if you update the config document directly; the only trigger needed is a restart. This is by design: the correct way to change the configuration is the replSetReconfig command, which includes sanity checks that catch these errors. |
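| For reference, the supported way to swap a secondary for an arbiter is to edit a copy of the configuration returned by rs.conf() and pass it to rs.reconfig() (which wraps the replSetReconfig command), rather than writing to local.system.replset directly. A minimal mongo-shell sketch, run against the shard's primary; the hostnames, ports, and member _id below are placeholders, not values from the original report: |

```javascript
// Run in the mongo shell, connected to the shard's PRIMARY.
// Hostnames and the member _id are placeholders.

// 1. Fetch the current replica set configuration.
var cfg = rs.conf();

// 2. Drop the secondary to be removed from the members array.
cfg.members = cfg.members.filter(function (m) {
    return m.host !== "shard1-sec:27017";
});

// 3. Add an arbiter in its place (new unique _id, arbiterOnly flag).
cfg.members.push({ _id: 10, host: "shard1-arb:27017", arbiterOnly: true });

// 4. Apply the new configuration; replSetReconfig performs sanity
//    checks and propagates the change to all members.
rs.reconfig(cfg);
```

| Going through replSetReconfig is what makes the change durable: it validates the new document and distributes it to every member, whereas a direct write to local.system.replset on one node leaves the cluster with inconsistent configurations, producing the erase/re-add loop described above. |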