[SERVER-5217] Replica set fail-over on high volume latency | Created: 06/Mar/12 | Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.0.1, 2.0.2 |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Sebastian Dahlgren | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | replication | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Debian GNU/Linux 6 |
| Assigned Teams: | Replication |
| Description |
|
MongoDB's heartbeat does not monitor the health of disk reads and writes, so if the underlying disks on the primary node are having problems, MongoDB will not switch primary. I would like the heartbeat to include a health check of read/write performance. It would probably be good if this more extensive heartbeat were optional. See the discussion on the mongodb-user mailing list: https://groups.google.com/forum/?fromgroups#!starred/mongodb-user/gY7r3f-yz0k. Right now, our only option when a node has disk problems is to stop the mongod process in order to force a change of primary node. |
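To make the request concrete, here is a minimal sketch of the kind of disk check described above. It is not MongoDB's heartbeat code: the function names `probe_write_latency` and `disk_is_healthy`, the 4 KiB payload, and the 0.5 s threshold are all assumptions chosen for illustration. It times a small write plus fsync in a given directory and treats slow or failing writes as unhealthy.

```python
import os
import tempfile
import time


def probe_write_latency(directory: str, payload_size: int = 4096) -> float:
    """Write and fsync a small temp file in `directory`; return elapsed seconds.

    Hypothetical helper for illustration only, not part of MongoDB.
    """
    payload = b"\0" * payload_size
    start = time.monotonic()
    fd, path = tempfile.mkstemp(dir=directory, prefix="probe-")
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the data to the device so the disk is actually exercised
    finally:
        os.close(fd)
        os.unlink(path)  # leave no probe files behind
    return time.monotonic() - start


def disk_is_healthy(directory: str, max_latency_s: float = 0.5) -> bool:
    """Return True if a probe write completes within `max_latency_s` (assumed threshold)."""
    try:
        return probe_write_latency(directory) <= max_latency_s
    except OSError:
        # An I/O error or missing directory counts as unhealthy.
        return False
```

A probe like this would be pointed at the volume holding the data files. Note that a write that hangs indefinitely would block the caller, so a real check would also need a timeout around the probe.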
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 06/Mar/12 ] |
|
Sorry - I didn't mean this would increase the load. I meant that a short-term spike in user load could flip the set unnecessarily, or an overall increase could cause the set to flip back and forth constantly. |
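One common way to reduce the flapping risk raised here is to act only on sustained failure rather than on a single slow probe. The sketch below is an illustration under assumptions, not anything MongoDB implements: the class name `FlapGuard` and the window and threshold values are hypothetical. It keeps a sliding window of recent probe results and signals a fail-over only when most of them failed.

```python
from collections import deque


class FlapGuard:
    """Signal a fail-over only after sustained probe failures (hypothetical helper)."""

    def __init__(self, window: int = 10, min_failures: int = 8):
        if not 0 < min_failures <= window:
            raise ValueError("min_failures must be between 1 and window")
        # Only the most recent `window` probe results are kept.
        self._results = deque(maxlen=window)
        self._min_failures = min_failures

    def record(self, healthy: bool) -> bool:
        """Record one probe result; return True if a fail-over looks warranted."""
        self._results.append(healthy)
        failures = sum(1 for ok in self._results if not ok)
        return failures >= self._min_failures
```

With `window=5, min_failures=4`, two slow probes followed by healthy ones never trigger, while four failures within the last five probes do. This dampens short spikes but does not address the second concern, an overall load increase that degrades every member equally.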
| Comment by Sebastian Dahlgren [ 06/Mar/12 ] |
|
Thanks for the quick feedback, Eliot. Two thoughts:
|
| Comment by Eliot Horowitz (Inactive) [ 06/Mar/12 ] |
|
Interesting, but tricky.
This is really an EC2/EBS-specific thing... What might make more sense is a hook that lets you specify a binary to execute that determines "health". |
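As a rough illustration of the hook idea, the sketch below runs an operator-supplied command and interprets its exit status. Everything here is an assumption made for the example: MongoDB has no such option in this ticket, the function name `run_health_hook` is hypothetical, and "exit code 0 means healthy" plus the 5 second timeout are conventions picked for the sketch.

```python
import subprocess
import sys
from typing import Sequence


def run_health_hook(command: Sequence[str], timeout_s: float = 5.0) -> bool:
    """Run an external health-check command; True means it exited with status 0.

    Hypothetical helper: a missing binary, a non-zero exit, or a timeout
    all count as unhealthy.
    """
    try:
        result = subprocess.run(
            command,
            timeout=timeout_s,          # a hung check must not block the caller forever
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,                # inspect the return code ourselves
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Trivial stand-in hook: a Python one-liner that exits with status 0.
    print(run_health_hook([sys.executable, "-c", "raise SystemExit(0)"]))
```

Delegating the decision to an external binary keeps environment-specific logic, such as EBS volume checks, out of the server and lets each deployment define "health" for itself.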