[SERVER-17598] Election happens frequently on primary shard Created: 16/Mar/15 Updated: 08/Apr/15 Resolved: 08/Apr/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Sharding |
| Affects Version/s: | 2.6.6 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Kazuo Yagi | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Participants: |
| Description |
|
To whom it may concern, I have a problem: elections happen frequently on the primary shard of our sharded cluster. The elections appear to coincide with heartbeat failures between the primary and secondary, although there do not seem to be any network problems. Neither the network traffic nor the number of connections is anywhere near its limit (traffic: 5 Mbps to 15 Mbps; connections: 30 to 60). Heartbeats have been failing continuously. Probably as a result, elections happen 10 to 20 times a day on the primary shard, compared with once or twice at most on the other, normal shards. Is there a good way to stabilize the primary shard? Our application fails every time an election happens and then has to wait until a new primary is elected. This problem significantly affects our application's performance. I would appreciate any help you could give me in solving it. Best Regards,
|
| Comments |
| Comment by Ramon Fernandez Marina [ 01/Apr/15 ] |
|
ka_yagi@fancs.com, we haven't heard back from you for some time. If this is still an issue for you, can you please provide the logs requested by Sam so we can investigate? Thanks, |
| Comment by Sam Kleinman (Inactive) [ 16/Mar/15 ] |
|
Without more information about your deployment, it's difficult to assess the root cause of this issue. Unnecessary failover events occur most commonly when there's some sort of network configuration error. If you can provide logs from all members of the affected replica set, we can look over them and see if there's an obvious root cause. |
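
Alongside the logs, it may help to record how each member of the affected replica set sees the others when heartbeats fail. The following is a minimal sketch, not an official diagnostic: it assumes PyMongo is installed, and the helper names `unhealthy_members` and `fetch_status` are hypothetical. It relies on the `replSetGetStatus` command, which must be run against a mongod in the replica set rather than against a mongos.

```python
from typing import Any, Dict, List


def unhealthy_members(status: Dict[str, Any]) -> List[str]:
    """Return one summary line per member the queried node reports as unhealthy.

    `status` is the document returned by the replSetGetStatus command.
    """
    lines = []
    for member in status.get("members", []):
        # The entry for the queried node itself is marked "self" and carries
        # no heartbeat fields, so it is skipped.
        if member.get("self"):
            continue
        # health is 1 when the queried node can reach this member, 0 otherwise.
        if member.get("health", 0) != 1:
            lines.append(
                "{name} state={state} lastHeartbeat={hb} message={msg!r}".format(
                    name=member.get("name", "?"),
                    state=member.get("stateStr", "?"),
                    hb=member.get("lastHeartbeat", "?"),
                    msg=member.get("lastHeartbeatMessage", ""),
                )
            )
    return lines


def fetch_status(uri: str) -> Dict[str, Any]:
    """Run replSetGetStatus against one replica set member (needs PyMongo)."""
    # Imported lazily so that unhealthy_members() has no third-party dependency.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return client.admin.command("replSetGetStatus")
```

A possible use, with a placeholder host name, is `print("\n".join(unhealthy_members(fetch_status("mongodb://shard0-member1.example:27017"))))`, run periodically against each member so the output can be compared with the election times in the logs.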