[SERVER-51158] Must not truncate entire oplog before truncate point Created: 25/Sep/20 Updated: 29/Oct/23 Resolved: 12/Nov/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0-alpha0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Matthew Russotto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2020-10-19, Repl 2020-11-02, Repl 2020-11-16 |
| Participants: | |
| Linked BF Score: | 28 |
| Description |
|
As a result of the Remove Stable Optime Candidates List (PM-1713) project, it is possible to have a case where there are no oplog entries before the oplog truncate-after point (computed from the all-durable timestamp, which is at or after the stable optime candidate). This happens when an oplog hole is open long enough that the size of the oplog entries after the hole is bigger than the configured oplog size, and so all entries prior to the stable timestamp get truncated. We cannot handle this case; we need an oplog entry before the truncate point to know when to start fetching. So we must ensure during truncation that we leave a record at or before the truncate point. |
| Comments |
| Comment by Matthew Russotto [ 19/Feb/21 ] |
|
nb: There are two "truncate points", the replication truncate-after point and the storage mayTruncateUpTo point, which is limited by a number of things, including that it must be no later than the stable timestamp. The mayTruncateUpTo point is always earlier than the truncate-after point. The CR ensures there is always at least one oplog entry less than or equal to the mayTruncateUpTo point. |
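The invariant described above can be illustrated with a minimal sketch. This is not the server's implementation: the `Entry` tuple, the `truncate_oplog` function, and its parameters are hypothetical names chosen for illustration, and the oplog is modeled as a plain list sorted by ascending timestamp. The sketch drops the oldest entries while the oplog is over its size limit, but stops before removing the last entry at or before the mayTruncateUpTo point.

```python
from typing import List, Tuple

# Hypothetical simplified oplog entry: (timestamp, size_in_bytes).
Entry = Tuple[int, int]


def truncate_oplog(entries: List[Entry], max_bytes: int,
                   may_truncate_up_to: int) -> List[Entry]:
    """Drop the oldest entries while the oplog exceeds max_bytes, but never
    remove the last entry whose timestamp is <= may_truncate_up_to.

    Assumes `entries` is sorted by ascending timestamp.
    """
    kept = list(entries)
    total = sum(size for _, size in kept)
    while total > max_bytes and len(kept) > 1:
        # The oldest entry may be removed only if the next entry is still
        # at or before the mayTruncateUpTo point; otherwise removing it
        # would leave no entry at or before that point.
        if kept[1][0] > may_truncate_up_to:
            break
        total -= kept[0][1]
        kept.pop(0)
    return kept


# Entries after timestamp 3 alone exceed the 20-byte limit, as when a
# long-lived oplog hole lets later entries outgrow the configured size.
oplog = [(1, 10), (2, 10), (5, 10), (6, 10), (7, 10)]
result = truncate_oplog(oplog, max_bytes=20, may_truncate_up_to=3)
print(result[0])
```

In this model the oplog is allowed to stay over its size limit rather than lose the only entry at or before the mayTruncateUpTo point, which mirrors the trade-off the comment describes.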
| Comment by Githook User [ 12/Nov/20 ] |
|
Author: {'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'} Message: |
| Comment by Matthew Russotto [ 12/Nov/20 ] |
|
I have attached a reproducer for the record; I believe it is far too fragile to include in the test suites. |
| Comment by Matthew Russotto [ 04/Nov/20 ] |
|
Finally managed to reproduce the bug. While it can happen even with EMRC (enableMajorityReadConcern) true, it also requires writeConcernMajorityJournalDefault: false, on a replica set with a single voting node. If the replica set has multiple voting nodes, the majority cannot advance past the last real oplog entry, and thus this can't happen. If writeConcernMajorityJournalDefault is true, the majority write concern cannot advance until we're past the hole. |