[SERVER-27815] always write seed doc to oplog during initial sync Created: 25/Jan/17 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Replication
|
||||||||
| Linked BF Score: | 0 | ||||||||
| Description |
|
We currently write the seed doc to the oplog only when there are no other operations to apply. When there are operations to apply, we should still write this document, so that other nodes that have only the seed doc in their oplog can use us as a sync source. This is safe because there is nothing special about that operation: the same idempotency guarantees that make the next oplog entry (which is in the oplog) safe still apply to it. |
| Comments |
| Comment by Spencer Brody (Inactive) [ 02/Jun/17 ] |
|
Now that |
| Comment by Spencer Brody (Inactive) [ 07/Feb/17 ] |
|
I see, so I think the thing to do would be to always write down the oplog entry corresponding to the 'beginTimestamp' at the beginning of initial sync, then remove the part where we write down the entry corresponding to 'endTimestamp' if beginTimestamp and endTimestamp are the same. |
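
One way to read this suggestion is sketched below in Python. It is a toy model under stated assumptions, not the server's code: oplog entries are reduced to integer timestamps, data cloning is omitted, and `oplog_after_initial_sync` is a hypothetical name introduced only for illustration.

```python
def oplog_after_initial_sync(source_oplog, begin_ts, end_ts):
    """Return the syncing node's oplog under the restructured flow (toy model)."""
    # Step 1: at the start of initial sync, always write the entry at beginTimestamp.
    oplog = [ts for ts in source_oplog if ts == begin_ts]
    # Step 2 (cloning the data) is not modeled here.
    # Step 3: append every entry after beginTimestamp, up to and including endTimestamp.
    oplog.extend(ts for ts in source_oplog if begin_ts < ts <= end_ts)
    # No special case is needed when begin_ts == end_ts: step 3 simply adds nothing.
    return oplog
```

In this model the case where beginTimestamp and endTimestamp are equal needs no separate branch, because the entry has already been written in step 1.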
| Comment by Judah Schvimer [ 01/Feb/17 ] |
|
Here is where we currently seed the oplog if there are no oplog entries to add. If there would otherwise be no ops in the oplog, we insert the last oplog entry from the sync source as our only oplog entry. If we do have oplog entries to apply, we insert all of them except the first one we fetch, i.e. the entry at "beginTimestamp". This is usually fine, but it causes problems when some nodes receive only the "initiate" oplog entry and other nodes receive all of the oplog entries after the "initiate" oplog entry: the two groups then cannot sync from each other. If we always seeded the oplog, this type of race would be much harder to hit, and it would be impossible in this specific (and most common) case. |
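
The race described above can be illustrated with a minimal Python sketch. It is a toy model, not the server's implementation: oplog entries are reduced to integer timestamps, and `seed_oplog`, `can_sync_from`, and the `always_seed` flag are hypothetical names introduced only for illustration. The sync-source check is simplified to "the source's oplog must contain the syncing node's last entry".

```python
def seed_oplog(fetched, always_seed):
    """Build a node's oplog after initial sync (toy model).

    fetched: timestamps fetched from the sync source; fetched[0] is the
    entry at beginTimestamp.
    always_seed: False models the current behavior, True the proposed one.
    """
    begin, rest = fetched[0], fetched[1:]
    if not rest:
        # Nothing to apply: the begin entry is written as the only oplog entry.
        return [begin]
    # Ops to apply: the current behavior drops the begin entry.
    return ([begin] if always_seed else []) + rest


def can_sync_from(syncer_oplog, source_oplog):
    """Simplified assumption: the source must still hold the syncer's last entry."""
    return syncer_oplog[-1] in source_oplog


INITIATE = 1  # timestamp standing in for the "initiate" oplog entry

node_a = seed_oplog([INITIATE], always_seed=False)               # only saw "initiate"
node_b_current = seed_oplog([INITIATE, 2, 3], always_seed=False)
node_b_proposed = seed_oplog([INITIATE, 2, 3], always_seed=True)
```

In this model `node_a` is `[1]` and `node_b_current` is `[2, 3]`, so `can_sync_from(node_a, node_b_current)` is false: node B no longer holds the entry node A would need as a common point. With `always_seed=True`, node B keeps the "initiate" entry and node A can sync from it.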
| Comment by Spencer Brody (Inactive) [ 30/Jan/17 ] |
|
judah.schvimer I'm not sure I fully understand the proposed change here. Can you elaborate on what the 'seed doc' is in this case? |