[SERVER-21988] Rollback does not wait for applier to finish before starting Created: 22/Dec/15  Updated: 15/Nov/21  Resolved: 08/Jan/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.1
Fix Version/s: 3.2.3, 3.3.0

Type: Bug Priority: Major - P3
Reporter: Scott Hernandez (Inactive) Assignee: Scott Hernandez (Inactive)
Resolution: Done Votes: 0
Labels: code-and-test
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-20466 Rollback and batch apply can each set... Closed
is duplicated by SERVER-21979 MongoD instance came down Closed
is duplicated by SERVER-22482 Cache growing to 100% followed by crash Closed
is duplicated by SERVER-23525 Mongo Replica Set unusable after Pri... Closed
Related
related to SERVER-25145 During rollback (or w/minvalid invali... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Sprint: Repl E (01/08/16)
Participants:
Linked BF Score: 0

 Description   

This error message is a bit mislead as well. The error is due to trying to re-apply the last/current oplog entry again.

[rsSync] Fatal assertion 34361 OplogOutOfOrder Attempted to apply an earlier oplog entry (ts: Dec 18 01:16:07:1) when our lastWrittenOptime was (term: 2, timestamp: Dec 18 01:16:07:1)

After looking a bit more into this, I have found this sequence which caused the problem:

  • Chained replication
  • Failover with rollback
  • Chained replica starts rollback while still apply with buffered oplog entries

Under these conditions it is race where this error occurs when the rollback happens during/before all the buffered oplog entries are applied.



 Comments   
Comment by Ramon Fernandez Marina [ 26/Aug/16 ]

zyd_com, I'm wondering whether your're seeing the same symptom triggered by a different bug. Will you please open a new ticket and provide full logs for the affected node?

Thanks,
Ramón.

Comment by Zhang Youdong [ 25/Aug/16 ]

2016-08-24T19:02:44.644+0800 I -        [rsSync] Fatal assertion 34361 OplogOutOfOrder: Attempted to apply an earlier oplog entry (ts: Aug 24 19:02:29:1) when our lastWrittenOptime was (term: -1, timestamp: Aug 24 19:02:29:4d)
2016-08-24T19:02:44.647+0800 I CONTROL  [rsSync]

I encounter this problem when using mongodb-3.2.3,I am not sure if it's the same question? I want to know the detailed case for this problem,anybody can help me ?

Comment by Githook User [ 19/Jan/16 ]

Author:

{u'username': u'scotthernandez', u'name': u'Scott Hernandez', u'email': u'scotthernandez@gmail.com'}

Message: SERVER-21988: wait for applier before starting rollback

(cherry picked from commit 4846585c6a7f09a18ac8c313ca7b0cee405ad29c)
Branch: v3.2
https://github.com/mongodb/mongo/commit/05a115fe4e87db76c16fab5d17e23ac45329994d

Comment by Githook User [ 08/Jan/16 ]

Author:

{u'username': u'scotthernandez', u'name': u'Scott Hernandez', u'email': u'scotthernandez@gmail.com'}

Message: SERVER-21988: wait for applier before starting rollback
Branch: master
https://github.com/mongodb/mongo/commit/4846585c6a7f09a18ac8c313ca7b0cee405ad29c

Comment by Githook User [ 05/Jan/16 ]

Author:

{u'username': u'scotthernandez', u'name': u'Scott Hernandez', u'email': u'scotthernandez@gmail.com'}

Message: SERVER-21988: change outoforder error message logging.
Branch: master
https://github.com/mongodb/mongo/commit/aa4c76484f21515383197e3380060abac45b7b9c

Comment by Githook User [ 22/Dec/15 ]

Author:

{u'username': u'scotthernandez', u'name': u'Scott Hernandez', u'email': u'scotthernandez@gmail.com'}

Message: SERVER-21988: do not fassert when re-applying same oplog entry till we fix the buffering issue
Branch: v3.2
https://github.com/mongodb/mongo/commit/3905aa7b425fee814d67dd9ea8fb1ec13f3ecd2f

Generated at Thu Feb 08 03:59:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.