[SERVER-30873] Investigate where the oplog truncation happens during replication recovery Created: 29/Aug/17 Updated: 27/Oct/23 Resolved: 12/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Judah Schvimer |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Operating System: | ALL |
| Backport Requested: | v3.4 |
| Sprint: | Repl 2017-10-02 |
| Participants: | |
| Description |
|
The comments for ReplicationRecovery::_truncateOplogTo indicate that it should truncate entries after and including 'truncateTimestamp'. The implementation, however, finds the first oplog entry less than the 'truncateTimestamp', then tells the storage engine to truncate the oplog starting at that entry, and passes true for the 'inclusive' flag. This means it will actually truncate starting at the entry one before the 'truncateTimestamp' that was passed in. We need to audit exactly how and when the oplogTruncateAfterPoint is set and make sure we're truncating at the right points. |
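To make the 'inclusive' flag semantics concrete, here is a minimal standalone sketch (not the actual storage-engine API; the `truncateAfter` helper and integer timestamps are illustrative stand-ins) showing the difference between inclusive and exclusive truncation on a toy oplog:

```cpp
// Toy illustration of the 'inclusive' flag for a capped "truncate after"
// operation. Plain integers stand in for oplog timestamps/RecordIds.
#include <cassert>
#include <vector>

// Remove every entry newer than 'point'; when 'inclusive' is true, remove
// 'point' itself as well.
std::vector<int> truncateAfter(const std::vector<int>& oplog, int point, bool inclusive) {
    std::vector<int> result;
    for (int ts : oplog) {
        if (ts < point || (!inclusive && ts == point)) {
            result.push_back(ts);
        }
    }
    return result;
}

int main() {
    std::vector<int> oplog = {1, 2, 3, 4};

    // Inclusive truncation at 2 removes 2, 3, and 4.
    assert((truncateAfter(oplog, 2, /*inclusive=*/true) == std::vector<int>{1}));

    // Exclusive truncation at 2 keeps 2 and removes only 3 and 4.
    assert((truncateAfter(oplog, 2, /*inclusive=*/false) == std::vector<int>{1, 2}));

    return 0;
}
```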
| Comments |
| Comment by Daniel Gottlieb (Inactive) [ 12/Sep/17 ] |
|
That makes sense. While integrating rollback via recovery I was observing some problem in this area, but perhaps I misdiagnosed it. I'm with Eric; let's close this for now, and I'm sure that if there's a real problem it will show up again when the project is revisited. |
| Comment by Eric Milkie [ 12/Sep/17 ] |
|
There might still be some bug here, as I believe the problem involves the intersection of the stable timestamp and the truncation point. The unit test is just hard-coding some value for the truncate-after point, so while the logic is confirmed correct for the truncation point, there may still be a problem at a higher layer. I think we can resolve this particular ticket, though, and open a new ticket if we experience the problem again. |
| Comment by Spencer Brody (Inactive) [ 12/Sep/17 ] |
|
Makes sense to me. This came up because of an issue daniel.gottlieb was having in his testing. Dan, it seems like there's no bug here after all. Do your findings disagree with that? FYI milkie |
| Comment by Judah Schvimer [ 12/Sep/17 ] |
|
I believe this is actually fine and the code is doing what it's supposed to be doing, as evidenced by this unit test: https://github.com/mongodb/mongo/blob/bddfee0d34513631424363645139f8eb4acbbc4c/src/mongo/db/repl/replication_recovery_test.cpp#L263-L279

The code is a bit confusing, but we truncate the oplog at `oldestIDToDelete`, not at the current RecordId we're looking at. We search backwards through the oplog, checking at each entry whether we have passed the `truncateTimestamp`. If we have not yet passed it (the entry is greater than or equal to `truncateTimestamp`), we set `oldestIDToDelete` to the current entry. Once we reach an entry that is less than `truncateTimestamp`, we do NOT update `oldestIDToDelete` again; instead we pass the previously saved `oldestIDToDelete`, which must be greater than or equal to `truncateTimestamp`, into `cappedTruncateAfter`.

Example: we have an oplog [1,2,3] and we want to truncate at 2, yielding [1] afterwards. We begin by looking at 3. 3 is not less than 2, so we set `oldestIDToDelete` to 3. We then look at 2. 2 is not less than 2, so we set `oldestIDToDelete` to 2. We then look at 1. 1 is less than 2, so we truncate at `oldestIDToDelete` (inclusive), which is 2. This yields the desired oplog of [1].

spencer, do you agree with this reading, and does the unit test suffice? |
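As a sanity check of that reading, here is a small self-contained sketch (assumed, illustrative names; integer timestamps stand in for oplog entries/RecordIds, and the inclusive `cappedTruncateAfter` is simulated on a `std::vector`) that walks the [1,2,3] example above:

```cpp
// Sketch of the backward scan described in the comment above. This is not
// the real ReplicationRecovery::_truncateOplogTo code; names and the
// simulated cappedTruncateAfter are illustrative only.
#include <cassert>
#include <optional>
#include <vector>

// Scan the oplog from newest to oldest. Every entry >= truncateTimestamp
// becomes the new candidate 'oldestIDToDelete'; the first entry that is
// < truncateTimestamp stops the scan. We then truncate inclusively at
// oldestIDToDelete, which by construction is >= truncateTimestamp.
std::vector<int> truncateOplogTo(const std::vector<int>& oplog, int truncateTimestamp) {
    std::optional<int> oldestIDToDelete;
    for (auto it = oplog.rbegin(); it != oplog.rend(); ++it) {
        if (*it < truncateTimestamp) {
            break;  // Everything older than this entry is kept.
        }
        oldestIDToDelete = *it;
    }
    if (!oldestIDToDelete) {
        return oplog;  // Nothing at or after truncateTimestamp; keep everything.
    }
    // Simulated cappedTruncateAfter(oldestIDToDelete, /*inclusive=*/true):
    // drop oldestIDToDelete and everything newer.
    std::vector<int> result;
    for (int ts : oplog) {
        if (ts < *oldestIDToDelete) {
            result.push_back(ts);
        }
    }
    return result;
}

int main() {
    // The example from the comment: oplog [1,2,3], truncate at 2 -> [1].
    assert((truncateOplogTo({1, 2, 3}, 2) == std::vector<int>{1}));
    // Truncating at a timestamp newer than every entry leaves the oplog intact.
    assert((truncateOplogTo({1, 2, 3}, 4) == std::vector<int>{1, 2, 3}));
    return 0;
}
```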
| Comment by Spencer Brody (Inactive) [ 29/Aug/17 ] |
|
This appears to be an issue on older versions as well. |