[SERVER-20669] Unable to init mongodb node: rollback error RS100 reached beginning of remote oplog
Created: 28/Sep/15  Updated: 03/Jan/17  Resolved: 28/Sep/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Lucas Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-27493 Reverse oplog cursors can return earl... Closed
Operating System: ALL
Participants:

 Description   

MongoVersion: 3.0.6
WiredTiger.

My mongo node stopped responding at 10 AM today, with this log:

2015-09-28T05:04:43.706-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.706-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.706-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.706-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.706-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.707-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.707-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.707-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.707-0500 I REPL     [rsBackgroundSync] replSet our last op time fetched: Sep 28 04:19:00:a3
2015-09-28T05:04:43.707-0500 I REPL     [rsBackgroundSync] replset source's GTE: Sep 28 04:19:01:1
2015-09-28T05:04:43.707-0500 I QUERY    [conn12126] assertion 13435 not master and slaveOk=false ns:db.colname query:{ active: true }
2015-09-28T05:04:43.707-0500 I REPL     [rsBackgroundSync] beginning rollback
2015-09-28T05:04:43.707-0500 I REPL     [rsBackgroundSync] rollback 0
2015-09-28T05:04:43.707-0500 I REPL     [rsBackgroundSync] rollback 1
2015-09-28T05:04:43.708-0500 I REPL     [rsBackgroundSync] rollback 2 FindCommonPoint
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet info rollback our last optime:   Sep 28 04:19:00:a3
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet info rollback their last optime: Sep 28 05:04:41:25
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet info rollback diff in end of log times: -2741 seconds
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet rollback error RS100 reached beginning of remote oplog
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet   them:      mongo2:27017 (10.0.0.2) scanned: 1
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet   theirTime: Sep 28 05:04:41 560910b9:25
2015-09-28T05:04:44.212-0500 I REPL     [rsBackgroundSync] replSet   ourTime:   Sep 28 04:19:00 56090604:a3
2015-09-28T05:04:44.212-0500 E REPL     [rsBackgroundSync] RS100 reached beginning of remote oplog [1]
2015-09-28T05:04:44.220-0500 I -        [rsBackgroundSync] Fatal Assertion 18752
2015-09-28T05:04:44.221-0500 I -        [rsBackgroundSync]
 
***aborting after fassert() failure

Since then, the node has been trying to start, but it fails with this error every time:

2015-09-28T13:16:32.081-0500 E REPL     [rsBackgroundSync] sync producer problem: 10278 dbclient error communicating with server: mongo2:27017
2015-09-28T13:16:32.081-0500 I REPL     [ReplicationExecutor] syncing from: mongo2:27017
2015-09-28T13:16:32.973-0500 I REPL     [rsBackgroundSync] replSet our last op time fetched: Sep 28 04:19:00:a3
2015-09-28T13:16:32.973-0500 I REPL     [rsBackgroundSync] replset source's GTE: Sep 28 04:19:01:1
2015-09-28T13:16:32.973-0500 I REPL     [rsBackgroundSync] beginning rollback
2015-09-28T13:16:32.973-0500 I REPL     [rsBackgroundSync] rollback 0
2015-09-28T13:16:32.973-0500 I REPL     [rsBackgroundSync] rollback 1
2015-09-28T13:16:32.973-0500 I REPL     [rsBackgroundSync] rollback 2 FindCommonPoint
2015-09-28T13:16:33.402-0500 I REPL     [rsBackgroundSync] replSet info rollback our last optime:   Sep 28 04:19:00:a3
2015-09-28T13:16:33.403-0500 I REPL     [rsBackgroundSync] replSet info rollback their last optime: Sep 28 13:16:27:2f
2015-09-28T13:16:33.403-0500 I REPL     [rsBackgroundSync] replSet info rollback diff in end of log times: -32247 seconds
2015-09-28T13:16:33.403-0500 I REPL     [rsBackgroundSync] replSet rollback error RS100 reached beginning of remote oplog
2015-09-28T13:16:33.403-0500 I REPL     [rsBackgroundSync] replSet   them:      mongo2:27017 (10.0.0.2) scanned: 1
2015-09-28T13:16:33.403-0500 I REPL     [rsBackgroundSync] replSet   theirTime: Sep 28 13:16:27 560983fb:2f
2015-09-28T13:16:33.403-0500 I REPL     [rsBackgroundSync] replSet   ourTime:   Sep 28 04:19:00 56090604:a3
2015-09-28T13:16:33.403-0500 E REPL     [rsBackgroundSync] RS100 reached beginning of remote oplog [1]
2015-09-28T13:16:33.403-0500 I -        [rsBackgroundSync] Fatal Assertion 18752
2015-09-28T13:16:33.403-0500 I -        [rsBackgroundSync]
 
***aborting after fassert() failure

What is going on?
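
For readers following the log arithmetic: the `theirTime` and `ourTime` values above carry a hex seconds field and a hex increment, and the "diff in end of log times" line appears to be our seconds minus theirs. The sketch below is only an illustration under that assumption; the helper names `parse_optime` and `end_of_log_diff` are hypothetical and not part of MongoDB.

```python
from datetime import datetime, timezone


def parse_optime(text):
    """Split an optime such as '56090604:a3' into (seconds, increment).

    Assumes both fields are hexadecimal, as they appear in the log lines.
    """
    secs_hex, inc_hex = text.split(":")
    return int(secs_hex, 16), int(inc_hex, 16)


def end_of_log_diff(ours, theirs):
    """Difference in whole seconds between our last optime and theirs."""
    return parse_optime(ours)[0] - parse_optime(theirs)[0]


# Values copied from the two log excerpts in the description.
first_crash = end_of_log_diff("56090604:a3", "560910b9:25")
later_restart = end_of_log_diff("56090604:a3", "560983fb:2f")

# The seconds field is a Unix timestamp; convert ours to UTC for reference.
our_last_op = datetime.fromtimestamp(
    parse_optime("56090604:a3")[0], tz=timezone.utc
)

print(first_crash, later_restart, our_last_op.isoformat())
```

Both differences match the values logged by `rsBackgroundSync`, and the growing gap shows that the node's last fetched optime stayed fixed at 04:19:00 while the sync source kept advancing.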



 Comments   
Comment by Kelsey Schubert [ 03/Jan/17 ]

Hi lucasoares,

I wanted to let you know that we've identified a bug that explains the behavior you previously observed. While it's true that a secondary can fall off a replica set if the replication lag becomes larger than the oplog size, in this case it appears that you encountered SERVER-27493.

Please note that MongoDB 3.2 and later are not affected by this issue.

Sorry for the delay tracking this down.

Kind regards,
Thomas

Comment by Lucas [ 15/Oct/15 ]

Thanks.

Comment by Ramon Fernandez Marina [ 06/Oct/15 ]

lucasoares, it is possible for a secondary to fall off a replica set if the replication lag becomes larger than the oplog size. If you need a bigger oplog you can read about changing oplog size here, and the documentation about the oplog here.

Regards,
Ramón.
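
The general condition described in this comment can be sketched as a simple comparison: a secondary can resume replication only while its last applied operation is still present in the sync source's oplog. This is a minimal illustration with hypothetical function names and made-up timestamps, not MongoDB's actual implementation. Note also that the 03/Jan/17 comment attributes this particular report to SERVER-27493 rather than to an undersized oplog.

```python
def oplog_window_seconds(oldest_ts, newest_ts):
    """Time span covered by an oplog, given its oldest and newest entry times."""
    return newest_ts - oldest_ts


def can_catch_up(secondary_last_ts, source_oldest_ts):
    """True if the secondary's last applied op is still in the source's oplog.

    Once the source has overwritten that entry (the oplog is a capped
    collection), no common point remains and the secondary must be resynced.
    """
    return secondary_last_ts >= source_oldest_ts


# Hypothetical Unix timestamps, chosen only for illustration.
source_oldest = 1_000_000
source_newest = 1_003_600  # the source's oplog spans one hour

window = oplog_window_seconds(source_oldest, source_newest)

lagging_but_ok = can_catch_up(1_001_800, source_oldest)  # 30 minutes behind
too_stale = can_catch_up(999_000, source_oldest)  # older than the oldest entry

print(window, lagging_but_ok, too_stale)
```

A larger oplog widens the window, which is why increasing the oplog size is the usual remedy when lag regularly approaches it.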

Comment by Lucas [ 06/Oct/15 ]

And this isn't a bug? You mean that my mongo stops for a while and then can't recover normally?

Comment by Ramon Fernandez Marina [ 28/Sep/15 ]

lucasoares, looks like you'll need to resync this node.

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources.

Regards,
Ramón.
