[SERVER-13222] Secondary keeps getting into an inconsistent state
Created: 16/Mar/14  Updated: 10/Dec/14  Resolved: 17/Mar/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.8, 2.4.9
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ryan Witt
Assignee: Unassigned
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related: is related to SERVER-7200 (use oplog as op buffer on secondaries), Closed
Operating System: ALL
Participants:

Description

I've resynced this server several times but keep getting something like the following:

***** SERVER RESTARTED *****
 
Sun Mar 16 16:02:24.567 [websvr] admin web console waiting for connections on port 28017
Sun Mar 16 16:02:30.714 [rsStart] replSet I am m4.example.com:27017
Sun Mar 16 16:02:36.716 [rsStart] replSet STARTUP2
Sun Mar 16 16:02:36.721 [rsHealthPoll] replSet member m5.example.com:27017 is up
Sun Mar 16 16:02:36.721 [rsHealthPoll] replSet member m5.example.com:27017 is now in state PRIMARY
Sun Mar 16 16:02:37.717 [rsSync] replSet still syncing, not yet to minValid optime 5325bf2f:1f
Sun Mar 16 16:02:38.724 [rsHealthPoll] replSet member a1.example.com:27017 is up
Sun Mar 16 16:02:38.724 [rsHealthPoll] replSet member a1.incrowdads.net:27017 is now in state ARBITER
Sun Mar 16 16:02:38.849 [rsHealthPoll] replSet member m8.example.com:27017 is up
Sun Mar 16 16:02:38.849 [rsHealthPoll] replSet member m8.example.com:27017 is now in state SECONDARY
Sun Mar 16 16:02:43.718 [rsBackgroundSync] replSet syncing to: m5.example.com:27017
Sun Mar 16 16:02:43.719 [rsSync] replSet still syncing, not yet to minValid optime 5325bf2f:1f
Sun Mar 16 16:02:43.969 [rsBackgroundSync] replSet our last op time fetched: Mar 16 15:11:34:2c
Sun Mar 16 16:02:43.969 [rsBackgroundSync] replset source's GTE: Mar 16 15:11:39:1
Sun Mar 16 16:02:43.969 [rsBackgroundSync] replSet need to rollback, but in inconsistent state
Sun Mar 16 16:02:43.969 [rsBackgroundSync] minvalid: 5325bf2f:1f our last optime: 5325bf26:2c
Sun Mar 16 16:02:43.969 [rsBackgroundSync] replSet FATAL

What's happening here? What other information do I need to look at to make sense of this?



Comments
Comment by Ryan Witt [ 17/Mar/14 ]

Thanks Eric, Dan.

Comment by Daniel Pasette (Inactive) [ 17/Mar/14 ]

see: http://docs.mongodb.org/manual/tutorial/resync-replica-set-member/

Comment by Eric Milkie [ 17/Mar/14 ]

From the logs, it looks like your secondary crashed while it was in the middle of applying some operations. When it came back up, those same operations no longer existed on its sync source, which means they may have been rolled back there. In order to roll back these operations on the secondary, it needs to know what those operations were, and it cannot rely on its own data while it is still short of minValid (the "need to rollback, but in inconsistent state" line in the log).
SERVER-7200 will fix this issue. To fix this secondary for now, you must resync it.
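
As a rough aid for reading the optimes in the log above, the following Python sketch decodes them and reproduces the two checks that lead to FATAL. It assumes an optime is printed as hexadecimal "seconds:increment" (this matches the log, where 5325bf26 decodes to 15:11:34). The comparison rules are a simplified reading of the log messages, not the server's actual code, and the helper names are made up for illustration.

```python
from datetime import datetime, timezone

def parse_optime(text):
    """Split a 'seconds:increment' optime; both fields are assumed to be hex."""
    secs_hex, inc_hex = text.split(":")
    return int(secs_hex, 16), int(inc_hex, 16)

def wall_clock(optime):
    """Render the seconds part of an optime as a UTC timestamp."""
    secs, _ = optime
    return datetime.fromtimestamp(secs, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Values copied from the "minvalid: ... our last optime: ..." log line.
min_valid = parse_optime("5325bf2f:1f")
last_optime = parse_optime("5325bf26:2c")

# The log prints the source's GTE op as "Mar 16 15:11:39:1", five seconds
# after our last fetched op (15:11:34); rebuilt here from that offset.
source_gte = (last_optime[0] + 5, 0x1)

# Tuples compare seconds first, then increment.
# Assumption: the member is inconsistent until it has applied up to minValid.
is_inconsistent = last_optime < min_valid

# Assumption: if the first op on the source at or after our last fetched op
# is not that same op, our op is missing from the source, so the member
# would have to roll back.
needs_rollback = source_gte != last_optime

print("minValid    :", wall_clock(min_valid), "increment", min_valid[1])
print("last optime :", wall_clock(last_optime), "increment", last_optime[1])
print("inconsistent:", is_inconsistent, "| needs rollback:", needs_rollback)
```

Both conditions hold for the values in this ticket, which is the combination the log reports just before FATAL: a rollback is required, but the member never reached minValid, so a resync is the only way forward.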
