[SERVER-34215] Clearing initial sync flag should clear oplog truncate after point Created: 30/Mar/18  Updated: 29/Oct/23  Resolved: 11/Jun/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.0.0-rc5, 4.1.1

Type: Bug Priority: Major - P3
Reporter: Judah Schvimer Assignee: Vesselina Ratcheva (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Repl 2018-05-07, Repl 2018-05-21, Repl 2018-06-04, Repl 2018-06-18
Participants:

 Description   

If a user's node crashed with an oplogTruncateAfterPoint set, and the user then deleted the oplog, went into initial sync without applying any operations, cleared the initial sync flag, and crashed again, the node could start up with a stale oplogTruncateAfterPoint and delete part of its oplog. If that oplog had been used to commit a majority write (which is possible, since initial sync nodes send progress updates), this could lead to majority writes being rolled back.
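
The sequence above can be sketched with a toy model. This is only an illustration of the reasoning, not MongoDB code: the class `ToyNode`, its methods, the integer timestamps, and the "keep entries up to and including the point" truncation rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node's durable replication state."""
    oplog: List[int] = field(default_factory=list)    # oplog entry timestamps
    initial_sync_flag: bool = False
    oplog_truncate_after_point: Optional[int] = None  # None means "unset"

    def clear_initial_sync_flag(self, also_clear_truncate_point: bool) -> None:
        self.initial_sync_flag = False
        if also_clear_truncate_point:
            # The behavior this ticket asks for.
            self.oplog_truncate_after_point = None

    def startup_recovery(self) -> None:
        # Assumed rule: on startup, drop every entry after the point, if set.
        point = self.oplog_truncate_after_point
        if point is not None:
            self.oplog = [ts for ts in self.oplog if ts <= point]
            self.oplog_truncate_after_point = None


def run_scenario(also_clear_truncate_point: bool) -> List[int]:
    # 1. The node crashes with a truncate point set at timestamp 5.
    node = ToyNode(oplog=[1, 2, 3, 4, 5, 6, 7], oplog_truncate_after_point=5)
    # 2. The oplog is deleted and the node enters initial sync.
    node.oplog = []
    node.initial_sync_flag = True
    # 3. Initial sync applies no operations, so nothing resets the stale
    #    point. We assume the new oplog holds only entries newer than it.
    node.oplog = [8, 9, 10]
    # 4. Initial sync finishes and clears the flag.
    node.clear_initial_sync_flag(also_clear_truncate_point)
    # 5. The node crashes again and runs startup recovery.
    node.startup_recovery()
    return node.oplog


print(run_scenario(also_clear_truncate_point=False))  # stale point survives
print(run_scenario(also_clear_truncate_point=True))   # point cleared with flag
```

In this model the first call loses every oplog entry written during initial sync, while the second keeps them, which is the difference the proposed change is meant to make.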



 Comments   
Comment by Githook User [ 11/Jun/18 ]

Author:

{'username': 'vessy-mongodb', 'name': 'Vesselina Ratcheva', 'email': 'vesselina.ratcheva@10gen.com'}

Message: SERVER-34215 Clear oplog truncate after point when clearing initial sync flag

(cherry picked from commit f320f151e0a709dafc8f8359ab7fab303a7f74a9)
Branch: v4.0
https://github.com/mongodb/mongo/commit/ff650c07cf0ef56f1621a257a354fa09038ff01e

Comment by Githook User [ 11/Jun/18 ]

Author:

{'username': 'vessy-mongodb', 'name': 'Vesselina Ratcheva', 'email': 'vesselina.ratcheva@10gen.com'}

Message: SERVER-34215 Clear oplog truncate after point when clearing initial sync flag
Branch: master
https://github.com/mongodb/mongo/commit/f320f151e0a709dafc8f8359ab7fab303a7f74a9

Comment by Judah Schvimer [ 30/Mar/18 ]

I didn't actually see this occur; I just realized it by code inspection. It's possible something subtle prevents this from happening, and it's obviously a very narrow race, but I think it could occur. We clear the appliedThrough and minValid explicitly with the initial sync flag, so clearing the oplogTruncateAfterPoint as well seems reasonable.
