[SERVER-2694] Replication Sets ending up with all secondaries... and no primary Created: 07/Mar/11 Updated: 30/Mar/12 Resolved: 07/Mar/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Admin, Replication, Usability |
| Affects Version/s: | 1.8.0-rc0, 1.8.0-rc1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Peter Colclough | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Ubuntu 10, 64-bit, 8 GB memory... too much disk to worry about |
| Operating System: | Linux |
| Participants: |
| Description |
|
Firstly, as a new user... brilliant package... thanks. (And stupidly I posted this on the Ubuntu/mongo log as well... sorry... Monday morning syndrome.) Now... I have 6 instances in a replica set, spread over 2 physical machines. All works fine. If I then take down one of the machines, I end up with 3 instances, all of them secondaries. This is a basic setup with default voting rights and no arbiter. Running rs.status() at the mycache:SECONDARY> prompt shows all three surviving members as SECONDARY, with no primary (full output omitted). I tried rs.reconfig(), but that needs a primary, which I don't have. Thanks in advance for any help. |
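For reference, here is a minimal sketch of how a set like this might have been configured and why losing one machine leaves no primary. The set name "mycache" comes from the prompt above; the hostnames, ports, and member layout are assumptions for illustration only.

    // Hypothetical layout: three members per physical machine, default one vote each.
    // Hostnames machineA/machineB and the ports are illustrative, not from the ticket.
    cfg = {
        _id : "mycache",
        members : [
            { _id : 0, host : "machineA:27017" },
            { _id : 1, host : "machineA:27018" },
            { _id : 2, host : "machineA:27019" },
            { _id : 3, host : "machineB:27017" },
            { _id : 4, host : "machineB:27018" },
            { _id : 5, host : "machineB:27019" }
        ]
    };
    rs.initiate(cfg);

    // An election needs a strict majority of all 6 votes: floor(6/2) + 1 = 4.
    // If machineB is taken down, only 3 voters remain (3 < 4), so the survivors
    // all stay SECONDARY, and rs.reconfig() cannot run because it requires a primary.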
| Comments |
| Comment by Peter Colclough [ 11/Mar/11 ] |
|
Thanks Andrew... and others. I had already read those sections. I realise we have a 'catch-22' here. I am off to play with some scenarios to see whether we can 'automatically' recover (while emailing the sysadmins) without killing the system in the process. Thanks for your help |
| Comment by Andrew Armstrong [ 11/Mar/11 ] |
|
Try reading http://www.mongodb.org/display/DOCS/Reconfiguring+a+replica+set+when+members+are+down You may want to consider running an arbiter node on a separate machine (e.g. a web server) so you have an odd number of servers. The arbiter, as mentioned previously, is very lightweight, is not queried, and holds no data; all it does is cast a vote as a tie-breaker when there are failures. |
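As a preventive sketch of that suggestion (the host "arbiterhost", the port, and the dbpath are assumptions; note that rs.addArb() has to be run while the set still has a primary):

    # On a third machine (e.g. a web server), start a lightweight arbiter process:
    mongod --replSet mycache --port 27021 --dbpath /data/arbiter

    # Then, from a mongo shell connected to the current PRIMARY, add it to the set:
    > rs.addArb("arbiterhost:27021")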
| Comment by Peter Colclough [ 11/Mar/11 ] |
|
Hi Eliot, Thanks for the quick response. I kind of accept that, which is why I started with 3 nodes on one server..... which is a majority, unless they each vote for the next one down the line. I then doubled up the servers to test the 'inter-server operability' (I know... of COURSE it works). And I now understand... you need a majority of the total number of servers...... not a majority of the 'working' servers..... OK... so how do I add a new instance on the working server to give me a majority, bearing in mind I can't change the config, as I don't have a primary... Thanks for the help Peter C |
| Comment by Kristina Chodorow (Inactive) [ 08/Mar/11 ] |
|
Thus, the recommended approach is to have an odd number of servers. See the "Rationale" section of http://www.mongodb.org/display/DOCS/Replica+Set+Design+Concepts. The short answer is: the system is self-monitoring; it elects a primary when it can safely do so. You can't get what you're looking for (always having a primary) automatically without ending up with multiple masters and, thus, the possibility of conflicting writes. It would be possible to allow more automatic reconfiguring of a set with no primary, but I don't think "most members unexpectedly and permanently go down" happens regularly for most people. |
| Comment by Peter Colclough [ 08/Mar/11 ] |
|
OK, that worked... thanks. I still think it would be useful if we could 'programmatically' force a server to become primary. This would allow a system to self-monitor and, if this situation occurred, at least let a system monitor sort something out. It's a catch-22, because of the reasons you gave (i.e. not wanting two primaries in the same set), but it would also allow a primary to be chosen when one system goes down, leaving no majority. An arbiter would only work if it were on a third machine, since if each machine has an arbiter (in case the other goes down), then normal processing will fail, because 2 arbiters would negate the need for them (if you see what I mean). Conundrum.... |
| Comment by Kristina Chodorow (Inactive) [ 07/Mar/11 ] |
|
Shut down a server that could be primary (once your set is down to 3 servers) and restart it without the --replSet option and on a different port. Connect to it with the shell and modify the local.system.replset document so it only contains the 3 remaining servers. Increment the version number and save the document back to the local.system.replset collection. Then restart the server on the correct port with --replSet, and the other servers will pick up the config change. For example, going from four servers to two:

    $ mongo localhost:27021/local
    > config = db.system.replset.findOne()
    {
        ...
        "members" : [
            { "_id" : 0, "host" : "ubuntu:27017" },
            { "_id" : 1, "host" : "ubuntu:27018" },
            { "_id" : 2, "host" : "ubuntu:27019" },
            { "_id" : 3, "host" : "ubuntu:27020" }
        ]
    }
    > config.members.pop()
    { "_id" : 3, "host" : "ubuntu:27020" }
    > config.members.pop()
    { "_id" : 2, "host" : "ubuntu:27019" }
    > config.version++
    > db.system.replset.save(config)
    > db.system.replset.findOne()
    { ..., "members" : [ { "_id" : 0, "host" : "ubuntu:27017" }, { "_id" : 1, "host" : "ubuntu:27018" } ] }

See also: http://www.mongodb.org/display/DOCS/Reconfiguring+a+replica+set+when+members+are+down |
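The restart steps around that edit might look like the following (the dbpath, ports, and set name "mycache" are assumptions for illustration):

    # 1. Stop the chosen member, then restart it WITHOUT --replSet on a temporary port:
    mongod --dbpath /data/rs0 --port 27021

    # 2. Edit local.system.replset as in the transcript above, then shut this mongod down.

    # 3. Restart it on its normal port WITH --replSet; the other members pick up the new config:
    mongod --dbpath /data/rs0 --port 27017 --replSet mycache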
| Comment by Peter Colclough [ 07/Mar/11 ] |
|
I see that issue now... thanks. However, I am now in a situation where I have 3 'healthy' nodes, all of which are secondaries, and, it appears, no way of getting one of them to become primary. I can't add an arbiter, as I need a primary to change the config. Is there a way I can 'force' a primary, even if it means using the UI to do it? By the way, 'freezing', stepping down, etc. also don't achieve this, as I am always in a minority. This is still a necessary function, as otherwise we would be in a 'Mexican standoff' given the current scenario. I also don't see how voting changes/arbiters can actually help in a scenario where a machine or two are taken out of service (or drop unexpectedly), leaving a minority behind... the arbiter would have to be on a separate system, which always has access to all servers on that system...... |
| Comment by Kristina Chodorow (Inactive) [ 07/Mar/11 ] |
|
You can't elect a master based on the number of healthy nodes as then you could have a master on each side of a network partition. There is no way for a cluster of nodes to tell the difference between a network partition and nodes being down. You need a majority of the total number of nodes to elect a master. That's why we suggest having an odd number of nodes/an arbiter/giving a node one extra vote. |
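A quick worked illustration of that rule, using the member counts discussed in this ticket (the majority() helper is only for illustration):

    // Votes needed to elect a primary: a strict majority of ALL configured members,
    // not just of the members that are currently reachable.
    function majority(totalMembers) { return Math.floor(totalMembers / 2) + 1; }

    majority(6)   // 4 -- six members split 3/3: losing either machine leaves 3 < 4, so no primary
    majority(7)   // 4 -- seven members split 4/3: losing the '3' machine leaves 4 >= 4 (a primary
                  //      can be elected); losing the '4' machine leaves 3 < 4 (no primary)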
| Comment by Peter Colclough [ 07/Mar/11 ] |
|
Eliot, This still remains an issue (imvho). If you require a majority of the actual servers, including those that are unreachable, you may never be able to get a usable system. For example, if we had 7 servers, split 4 on one machine and 3 on another: if we take the '3' machine off, all is fine and dandy. If we take the '4' machine out, then we are left with a single machine running 3 instances, all secondaries. So the way around this is to have an arbiter. The arbiter would have to be on a third machine, so it isn't taken out when we take a machine down. Having an arbiter on one of the main machines would simply cause an issue if that machine were taken out. If the arbiter were on a third machine, and that were taken out, we are back to square one again... if you see what I mean. Surely the 'voting' should take place between the 'reachable' systems that are 'healthy'. That way you can always have a majority among the working systems. Or am I really missing the point here? Thanks in advance Peter C |
| Comment by Eliot Horowitz (Inactive) [ 07/Mar/11 ] |
|
Looks like you have 3 nodes up and 3 nodes down. |