[SERVER-3159] Re-adding member in replica set fails Created: 27/May/11 Updated: 12/Jul/16 Resolved: 02/Jun/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 1.6.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor - P4 |
| Reporter: | Pieter Ennes | Assignee: | Kristina Chodorow (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Replica Set on Ubuntu 10.04 LTS on Amazon EC2 |
| Operating System: | Linux |
| Description |
|
Removing and re-adding a member of a replica set seems to work:

> rs.remove("m2:27017")
{ "ok" : 1 }
> rs.status();
, , , { "_id" : 4, "name" : "m3:27017", "health" : 1, "state" : 2, "uptime" : 2, "lastHeartbeat" : "Fri May 27 2011 13:56:52 GMT+0000 (UTC)" } ],
> rs.add("m2:27017")
{ "ok" : 1 }
> rs.status();
, , , " ],

But the logs on the re-added node show this repeating message:

Fri May 27 14:03:14 [rs Manager] replSet error unexpected exception in haveNewConfig() : 0 assertion db/repl/rs.cpp:315 |
| Comments |
| Comment by Kristina Chodorow (Inactive) [ 02/Jun/11 ] |
|
Good idea, I've added a section in http://www.mongodb.org/display/DOCS/Adding+a+New+Set+Member |
| Comment by Pieter Ennes [ 02/Jun/11 ] |
|
Ah, that's a good tip. Maybe this is worth a note in the Wiki somewhere (or I overlooked it there)? We managed to add the node back after upgrading the cluster from 1.6.5 to 1.8.1. It seems the situation has changed a little in that release, for the better! I think this can be closed, thanks. |
| Comment by Kristina Chodorow (Inactive) [ 01/Jun/11 ] |
|
Perfect, I see what happened. Each member of the set has an _id (0, 1, 2, etc.). The error you're getting is that the _id of the member you're adding has changed. It looks like you removed the member with _id : 0 and then tried to re-add it with _id : 5.

This is where the command line helper's abstraction breaks down: you have to make sure that if you removed a replica set member with a certain _id, you add it back with the same _id. So, instead of rs.add("m2:27017"), you'd have to do:

rs.add({_id : 0, host : "m2:27017"})

Then you should be able to call remove/add back and forth. |
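The advice above can be sketched as a small helper that looks up a host's original _id in a config saved before the removal and builds the member document to pass to rs.add(). This is only an illustration: memberDocFor is a hypothetical helper, not part of the mongo shell, and the set name and config contents below are made-up values shaped like rs.conf() output.

```javascript
// Hypothetical helper (not part of the mongo shell): given a replica set
// config saved before the removal, build the member document that re-adds
// the host under its original _id.
function memberDocFor(savedConfig, host) {
  var members = savedConfig.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].host === host) {
      return { _id: members[i]._id, host: host };
    }
  }
  throw new Error("host not found in saved config: " + host);
}

// Example config shaped like rs.conf() output. The set name and member
// list are assumptions for illustration, not taken from the reporter's set.
var savedConfig = {
  _id: "exampleSet",
  members: [
    { _id: 0, host: "m2:27017" },
    { _id: 4, host: "m3:27017" }
  ]
};

// Build the document that keeps the original _id for m2.
var doc = memberDocFor(savedConfig, "m2:27017");
```

In the mongo shell, one would presumably capture the config with rs.conf() on the primary before calling rs.remove(), and afterwards pass the resulting document to rs.add(doc) instead of the bare host string.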
| Comment by Pieter Ennes [ 29/May/11 ] |
|
Kristina, please find it attached. This may be relevant too: Please note the change of server version between the first run (1.6.5, starting at Fri May 27 13:54:15 in the log) and the second run (1.8.1, starting at Fri May 27 14:11:33), which we did in an attempt to bypass the error. |
| Comment by Kristina Chodorow (Inactive) [ 27/May/11 ] |
|
It's very doubtful |
| Comment by Kristina Chodorow (Inactive) [ 27/May/11 ] |
|
What version are you running and can you paste a bigger chunk of the logs? |
| Comment by Pieter Ennes [ 27/May/11 ] |
|
Forgot to note that the re-added node was started with a clean data directory before issuing the rs.remove/add() sequence on the primary. |
| Comment by Pieter Ennes [ 27/May/11 ] |
|
Doing the same thing again leads to https://jira.mongodb.org/browse/SERVER-2981:

> rs.remove("m2:27017")
{ "ok" : 1 }
> rs.add("m2:27017")

What happened in the first place?