[SERVER-43118] Primary randomly steps down without being explicitly told Created: 01/Sep/19 Updated: 11/Oct/19 Resolved: 11/Oct/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Chad Kreimendahl | Assignee: | Danny Hatcher (Inactive) |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Steps To Reproduce: | It begins:
.... a bunch of clients reconnecting and queries in the queue executing
.... a bunch of clients reconnecting
.... a bunch of clients reconnecting. And then it decides it wants to be PRIMARY again:
|
| Participants: |
| Description |
|
One of our clusters (out of dozens) will randomly step down from being primary, without any command being sent to the system telling it to do so.
We see it approximately 1-2 times per week, and always in either the valleys or the peaks of our usage. |
| Comments |
| Comment by Danny Hatcher (Inactive) [ 11/Oct/19 ] |
|
I'm going to close this ticket for now then. If you do see it happen again, please leave a comment and we will re-open. |
| Comment by Chad Kreimendahl [ 08/Oct/19 ] |
|
It has not, though I suspect that is in large part due to our moving about 99.5% of the databases (by table count and size) off of the system and onto alternate systems. |
| Comment by Danny Hatcher (Inactive) [ 01/Oct/19 ] |
|
mongo@phish.org, did you ever see this happen again? |
| Comment by Chad Kreimendahl [ 05/Sep/19 ] |
|
I'll need to wait for it to happen again. I'll check each morning and get you that once we see it again. Will likely be in the next week, if history follows. |
| Comment by Danny Hatcher (Inactive) [ 03/Sep/19 ] |
|
Hello Chad, In order for us to diagnose this, can you please upload the full mongod logs covering at least one of these problem time frames, as well as the "diagnostic.data" folder for every node in the replica set? You can use our Secure Uploader so that only MongoDB engineers will be able to access them. |
| Comment by Chad Kreimendahl [ 01/Sep/19 ] |
|
Forgot: Version 3.4.22, 4-member replica set. Significant CPU, memory, and disk available.