[SERVER-29803] Add a 'tooStale' field to replSetGetStatus output when a node is in RECOVERING due to being too stale to sync from any available node
Created: 22/Jun/17  Updated: 30/Oct/23  Resolved: 10/Jul/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.3.1

Type: Improvement
Priority: Major - P3
Reporter: Shawn McCarthy (Inactive)
Assignee: A. Jesse Jiryu Davis
Resolution: Fixed
Votes: 4
Labels: high-value, neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-12872 Investigate changes in SERVER-29803: ... Closed
Related
is related to SERVER-42510 Fix race in too_stale_secondary.js Closed
Backwards Compatibility: Minor Change
Sprint: Repl 2019-07-15
Participants:
Case:
Linked BF Score: 0

 Description   

When a mongod that is part of a replica set cannot recover because the oplog has rolled over, the node stays in the RECOVERING state. There is no way to distinguish a node in RECOVERING that is actively applying oplog entries and will eventually transition to SECONDARY from one that is in RECOVERING because it is too stale and will never recover unless a node becomes available whose oplog overlaps with its own.

Currently the only way to get this information is through the logs, which keeps tools like Ops Manager and Cloud Manager from easily detecting and alerting when a node falls off the back of all available sync source oplogs.



 Comments   
Comment by A. Jesse Jiryu Davis [ 10/Jul/19 ]

Answer from the Cloud team: they'd prefer that it isn't backported.

Comment by A. Jesse Jiryu Davis [ 10/Jul/19 ]

It should be pretty easy and Cloud has a ticket depending on this. cailin.nelson, a question for you please: how far back should this be ported for Cloud's sake?

Comment by Daniel Pasette (Inactive) [ 10/Jul/19 ]

Will this be backported?

Comment by Githook User [ 10/Jul/19 ]

Author: A. Jesse Jiryu Davis (ajdavis) <jesse@mongodb.com>

Message: SERVER-29803 Add replSetGetStatus field tooStale
Branch: master
https://github.com/mongodb/mongo/commit/1433d75e416e1078bb490ecda04c9e12b1a0ab3d

Comment by Alyson Cabral (Inactive) [ 10/Jul/19 ]

Yes, this works. Thanks, Jesse.

Comment by A. Jesse Jiryu Davis [ 08/Jul/19 ]

alyson.cabral my patch will add a field "tooStale: true" at the top level of the replSetGetStatus reply from a secondary when it can't recover because it's fallen off the oplog. In other states, there is no tooStale field in replSetGetStatus. Do you approve this change or would you like a different solution here?
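
A minimal sketch of how a monitoring script might consume the field described above. It assumes the replSetGetStatus reply is already available as a plain dict, that the top-level `myState` value 3 means RECOVERING, and that `tooStale` appears only when it is true. The function name and the returned labels are hypothetical and are not part of the patch.

```python
from typing import Any, Mapping

# Numeric member state for RECOVERING as reported in the top-level
# "myState" field of a replSetGetStatus reply (assumption of this sketch).
RECOVERING = 3


def classify_recovering_node(status: Mapping[str, Any]) -> str:
    """Classify a replSetGetStatus reply passed in as a plain mapping.

    Returns one of three hypothetical labels:
      "not-recovering": the node is in some state other than RECOVERING
      "too-stale":      RECOVERING and the reply carries tooStale: true
      "catching-up":    RECOVERING without a tooStale field
    """
    if status.get("myState") != RECOVERING:
        return "not-recovering"
    # The field is absent in all other situations, so a missing key
    # is treated the same as false.
    if status.get("tooStale") is True:
        return "too-stale"
    return "catching-up"
```

With PyMongo, the reply could be obtained with `client.admin.command("replSetGetStatus")` and passed straight to this function. An alerting tool would then raise on "too-stale" only, since a node that is merely catching up is expected to reach SECONDARY without intervention.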

Comment by Spencer Brody (Inactive) [ 27/Oct/17 ]

Moving this to the backlog, as it seems it is no longer a requirement for the Cloud team.
