Type: Improvement
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 1.8.1
Component/s: Replication
Labels: None
Environment: linux, ec2, ebs
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

During the massive EC2 outage earlier this morning, the master of one of our replica sets was affected: it stopped responding to the clients that were still connected, but did not close their connections. The other members of the set did not detect the failure, and it was not possible to send a "stepdown" command to the master. (As the machine was not answering ssh, we did a remote reboot to force the replica set onto its two other feet.)

Replica failure detection should be less optimistic.
It should be possible to trigger an election from a secondary in such a situation.
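
To illustrate the failure mode described above (a peer that keeps accepting TCP connections but never answers), here is a minimal sketch of a "pessimistic" liveness probe. It is not MongoDB's heartbeat code: the `probe` and `start_hung_server` functions and the `ping` payload are hypothetical names chosen for the example. The only point it makes is that a peer should count as alive only if it replies within a deadline, not merely because the socket is still open.

```python
import socket


def probe(host, port, timeout_s=2.0):
    """Return True only if the peer connects AND replies within timeout_s.

    Hypothetical helper: a hung peer that holds the connection open
    without answering is reported as down, same as a refused connection.
    """
    try:
        # The timeout applies to connect() and to every later send/recv.
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            conn.sendall(b"ping\n")
            # An empty read means the peer closed the connection.
            return conn.recv(16) != b""
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False


def start_hung_server():
    """Simulate the hung master: listen, but never accept or reply.

    The kernel still completes the TCP handshake from the listen backlog,
    so clients see an open connection that stays silent.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(5)
    return srv


if __name__ == "__main__":
    hung = start_hung_server()
    port = hung.getsockname()[1]
    print("hung peer considered alive:", probe("127.0.0.1", port, timeout_s=0.5))
    hung.close()
```

Without the read timeout, the `recv` call would block for as long as the hung peer keeps the connection open, which matches the behaviour reported here.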

duplicates

SERVER-3014 DBClientConnection socket timeout doesn't work correctly

Closed

Assignee: Kristina Chodorow (Inactive)
Reporter: Mathieu Poumeyrol
Participants: Eliot Horowitz, Jonathan Wollman, Kristina Chodorow, Mathieu Poumeyrol
Votes: 2
Watchers: 4

Created: Apr 21 2011 11:48:13 AM UTC
Updated: May 29 2012 02:53:16 PM UTC
Resolved: May 02 2011 04:02:41 PM UTC
