[SERVER-4700] Critical replication failures should bring server into RECOVERING state Created: 17/Jan/12  Updated: 30/Mar/12  Resolved: 18/Jan/12

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Aristarkh Zagorodnikov Assignee: Kristina Chodorow (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

In my opinion, the dreaded "replSet error RS102 too stale to catch up" should mark the server as RECOVERING, or as failed in some other way, because there is currently no other way to determine whether replication is still occurring. Silent failures, on the other hand, can lead to data loss in emergency cases.



 Comments   
Comment by Aristarkh Zagorodnikov [ 18/Jan/12 ]

Sorry, I checked the logs, and the server really did move to the RECOVERING state. I guess I confused that with the fact that I never got a warning from MMS. I am very sorry for the false report again.
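
For anyone who wants to check member state directly rather than rely on an alert, here is a minimal sketch. It assumes you already have the output of the replSetGetStatus command (what rs.status() returns in the shell) as a Python dict. The helper name recovering_members and the sample host names are hypothetical and used only for illustration. In that output, a member in RECOVERING is reported with state 3 and stateStr "RECOVERING".

```python
# Numeric code reported in the "state" field for a RECOVERING member.
RECOVERING = 3


def recovering_members(status):
    """Return the names of members reported in RECOVERING state.

    `status` is assumed to be a dict shaped like replSetGetStatus output.
    """
    return [
        member.get("name", "<unknown>")
        for member in status.get("members", [])
        if member.get("state") == RECOVERING
        or member.get("stateStr") == "RECOVERING"
    ]


# Hypothetical sample document; the set name and hosts are made up.
sample_status = {
    "set": "rs0",
    "members": [
        {"name": "db1.example.net:27017", "state": 1, "stateStr": "PRIMARY"},
        {"name": "db2.example.net:27017", "state": 3, "stateStr": "RECOVERING"},
    ],
}

print(recovering_members(sample_status))
```

A check like this could be run periodically against each replica set, so that a member stuck in RECOVERING is noticed even if no monitoring warning arrives.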

Comment by Aristarkh Zagorodnikov [ 18/Jan/12 ]

It occurs only rarely with one of our replica sets; I'll post logs when (if) it happens again.

Comment by Eliot Horowitz (Inactive) [ 17/Jan/12 ]

RS102 definitely makes the server go into RECOVERING.
Do you have a case showing otherwise?

Generated at Thu Feb 08 03:06:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.