[SERVER-29803] Add a 'tooStale' field to replSetGetStatus output when a node is in RECOVERING due to being too stale to sync from any available node Created: 22/Jun/17 Updated: 30/Oct/23 Resolved: 10/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Shawn McCarthy (Inactive) | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Fixed | Votes: | 4 |
| Labels: | high-value, neweng |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Minor Change |
| Sprint: | Repl 2019-07-15 |
| Participants: |
| Case: | (copied to CRM) |
| Linked BF Score: | 0 |
| Description |
|
When a mongod that is part of a replica set cannot recover because the oplog has rolled over, the node stays in the RECOVERING state. The replSetGetStatus output offers no way to distinguish a node in RECOVERING that is actively applying oplog entries and will eventually transition to SECONDARY from one that is in RECOVERING because it is too stale and will never recover unless a node whose oplog overlaps its own becomes available. Currently the only way to get this information is through the logs, which keeps tools like Ops Manager and Cloud Manager from easily detecting and alerting when a node falls off the back of all available sync source oplogs. |
| Comments |
| Comment by A. Jesse Jiryu Davis [ 10/Jul/19 ] |
|
Answer from the Cloud team: they'd prefer that it isn't backported. |
| Comment by A. Jesse Jiryu Davis [ 10/Jul/19 ] |
|
It should be pretty easy and Cloud has a ticket depending on this. cailin.nelson, a question for you please: how far back should this be ported for Cloud's sake? |
| Comment by Daniel Pasette (Inactive) [ 10/Jul/19 ] |
|
Will this be backported? |
| Comment by Githook User [ 10/Jul/19 ] |
|
Author: {'name': 'A. Jesse Jiryu Davis', 'username': 'ajdavis', 'email': 'jesse@mongodb.com'} Message: |
| Comment by Alyson Cabral (Inactive) [ 10/Jul/19 ] |
|
Yes, this works. Thanks, Jesse. |
| Comment by A. Jesse Jiryu Davis [ 08/Jul/19 ] |
|
alyson.cabral, my patch will add a field "tooStale: true" at the top level of the replSetGetStatus reply from a secondary when it can't recover because it has fallen off the oplog. In other states, there is no tooStale field in replSetGetStatus. Do you approve this change, or would you like a different solution here? |
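For monitoring tools, the proposed field could be consumed roughly as follows. This is only a sketch under stated assumptions: the function name `classify_recovering` and its three return labels are hypothetical, and the sample replies are heavily abbreviated. It relies on `myState` being 3 for a member in RECOVERING and on `tooStale` being present, with the value true, only when the node cannot recover, as described in the comment above.

```python
from typing import Any, Mapping

# Numeric replica set member state reported in "myState" for RECOVERING.
RECOVERING = 3


def classify_recovering(status: Mapping[str, Any]) -> str:
    """Classify a replSetGetStatus reply (hypothetical helper).

    Returns one of three illustrative labels:
      "not-recovering": the node is in some state other than RECOVERING
      "too-stale":      RECOVERING and the reply carries tooStale: true
      "catching-up":    RECOVERING with no tooStale field
    """
    if status.get("myState") != RECOVERING:
        return "not-recovering"
    # Per the proposal, the field is absent unless the node is too stale.
    if status.get("tooStale") is True:
        return "too-stale"
    return "catching-up"


# Abbreviated sample replies; real replies contain many more fields.
stale_reply = {"set": "rs0", "myState": 3, "tooStale": True}
syncing_reply = {"set": "rs0", "myState": 3}
secondary_reply = {"set": "rs0", "myState": 2}

for reply in (stale_reply, syncing_reply, secondary_reply):
    print(classify_recovering(reply))
```

With PyMongo, a reply of this shape could presumably be obtained through `client.admin.command("replSetGetStatus")` and passed to the helper unchanged. Treating a missing field as "not too stale" follows the statement that the field does not appear in other states.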
| Comment by Spencer Brody (Inactive) [ 27/Oct/17 ] |
|
Moving this to the backlog, as it seems it is no longer a requirement for the Cloud team.