Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 1.8.0-rc0, 1.8.0-rc1
Component/s: Admin, Replication, Usability
Environment: Ubuntu 10 64 bit, 8gig memory... too much disk to worry about
Firstly... as a new user... brilliant package... thanks. (And stupidly I posted this on the Ubuntu/mongo log as well... sorry... Monday morning syndrome.)
Now... I have 6 instances in a replica set, spread over 2 physical machines. All works fine. If I then take down one of the machines, I end up with 3 instances, all of them secondaries. This is a basic setup with default voting rights, and no arbiter.
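For reference, a rough sketch of the sort of config used to initiate the set (host names and ports as they appear in the rs.status() output below; everything else left at the defaults, so no arbiter and one vote per member):

cfg = {
    _id : "mycache",
    members : [
        { _id : 0, host : "n.n.n.1:27017" },
        { _id : 1, host : "n.n.n.2:27018" },
        { _id : 2, host : "n.n.n.3:27019" },
        { _id : 3, host : "n.n.1.1:27017" },
        { _id : 4, host : "n.n.1.2:27018" },
        { _id : 5, host : "n.n.1.3:27019" }
    ]
}
rs.initiate(cfg)    // every member gets the default single vote, no arbiter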
The result of a rs.status() is below:
mycache:SECONDARY> rs.status()
{
    "set" : "mycache",
    "date" : ISODate("2011-03-04T15:49:01Z"),
    "myState" : 2,
    "members" : [
        {
            "_id" : 0,
            "name" : "n.n.n.1:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 202,
            "optime" : ,
            "optimeDate" : ISODate("2011-03-04T14:50:55Z"),
            "lastHeartbeat" : ISODate("2011-03-04T15:49:01Z")
        },
        {
            "_id" : 1,
            "name" : "n.n.n.2:27018",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "optime" : ,
            "optimeDate" : ISODate("2011-03-04T14:50:55Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "n.n.n.3:27019",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 202,
            "optime" : ,
            "optimeDate" : ISODate("2011-03-04T14:50:55Z"),
            "lastHeartbeat" : ISODate("2011-03-04T15:49:01Z")
        },
        {
            "_id" : 3,
            "name" : "n.n.1.1:27017",
            "health" : 0,
            "state" : 2,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : ,
            "optimeDate" : ISODate("2011-03-04T14:50:55Z"),
            "lastHeartbeat" : ISODate("2011-03-04T15:46:45Z"),
            "errmsg" : "socket exception"
        },
        {
            "_id" : 4,
            "name" : "n.n.1.2:27018",
            "health" : 0,
            "state" : 1,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : ,
            "optimeDate" : ISODate("2011-03-04T14:50:55Z"),
            "lastHeartbeat" : ISODate("2011-03-04T15:46:45Z"),
            "errmsg" : "socket exception"
        },
        {
            "_id" : 5,
            "name" : "n.n.1.3:27019",
            "health" : 0,
            "state" : 2,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : ,
            "optimeDate" : ISODate("2011-03-04T14:50:55Z"),
            "lastHeartbeat" : ISODate("2011-03-04T15:46:45Z"),
            "errmsg" : "socket exception"
        }
    ],
    "ok" : 1
}
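(Side note: a quick way to count how many members this node can still see, just a shell convenience based on the output above, not part of any fix:)

var s = rs.status()
var up = s.members.filter(function (m) { return m.health == 1 })
print(up.length + " of " + s.members.length + " members reachable")    // prints "3 of 6 members reachable" here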
1. I tried reconfig, but that needs a primary, which I don't have.
2. I tried taking an instance down, freezing the other two, and bringing the third back up... it came back as a secondary.
3. I am going to try creating a new instance and setting it up as an arbiter, to see if that helps elect a primary. However, this is not a long-term solution (see 4 below).
4. If I have more than one machine taking part in a replica set, then in theory, for a resilient system, each machine would need to host an arbiter, in case another machine got taken out. With an even number of machines, that gives us an even number of arbiters, which doesn't help if they are all in play (unless I am missing something obvious... not for the first time).
If, however, we assign bitwise voting rights to each instance in a replica set (1, 2, 4, 8, 16...), then any single instance, or a whole machine, can be downed and a definite primary will still be voted in. This removes the need for an arbiter, and also gives the admins a chance to prioritise the servers taking part... but I need a primary to change the config. (A rough sketch of what I mean is below.)
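To illustrate, something along these lines is what I have in mind. Only a sketch: it assumes the config accepts per-member "votes" values greater than one, and of course rs.reconfig() has to be run against a primary, which is exactly what I don't have:

cfg = rs.conf()
cfg.members[0].votes = 1
cfg.members[1].votes = 2
cfg.members[2].votes = 4
cfg.members[3].votes = 8
cfg.members[4].votes = 16
cfg.members[5].votes = 32
rs.reconfig(cfg)    // needs a primary to accept the new config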
Thanks in advance for any help