[SERVER-33878] Update OplogFetcher to go into SyncSource selection on CappedPositionLost error
| Created: | 14/Mar/18 | Updated: | 05/Dec/22 | Resolved: | 24/Jul/20 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Backlog - Replication Team |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | neweng |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Assigned Teams: | Replication |
| Participants: |
| Case: | (copied to CRM) |
| Linked BF Score: | 0 |
| Description |
|
If the OplogFetcher gets a CappedPositionLost error during secondary oplog fetching because we have fallen off the back of our sync source’s oplog, and we retry the query (the current behavior), we are guaranteed to go into rollback. There we either fassert (in rollback via refetch) or fail to find a common point (in rollback to a stable timestamp), which means reading the entire sync source oplog first. If we instead went straight back to sync source selection, we could skip all of that and either find a better sync source that works or log the “too stale” error as expected. |
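
The proposed behavior can be illustrated with a small Python sketch. This is a simplified model under stated assumptions, not the server's C++ implementation: `FetchError`, `Action`, `next_action`, and `choose_sync_source` are hypothetical names, timestamps are plain integers, and a candidate is treated as usable only if its oplog still contains our last fetched timestamp and extends past it.

```python
from enum import Enum, auto


class FetchError(Enum):
    # Hypothetical error kinds for illustration only.
    NETWORK_TIMEOUT = auto()
    CAPPED_POSITION_LOST = auto()


class Action(Enum):
    RETRY_SAME_SOURCE = auto()
    RESELECT_SYNC_SOURCE = auto()


def next_action(error, retries_left):
    """Decide what the fetcher does after a failed oplog query."""
    # Retrying cannot help once we have fallen off the source's oplog,
    # so go straight back to sync source selection.
    if error is FetchError.CAPPED_POSITION_LOST:
        return Action.RESELECT_SYNC_SOURCE
    # Assumed policy for other errors: retry until the budget runs out.
    if retries_left > 0:
        return Action.RETRY_SAME_SOURCE
    return Action.RESELECT_SYNC_SOURCE


def choose_sync_source(last_fetched_ts, candidates):
    """Return the first candidate whose oplog window covers our position.

    `candidates` maps a node name to (oldest_ts, newest_ts) of its oplog.
    Returns None when no candidate works, i.e. we are "too stale".
    """
    for name, (oldest_ts, newest_ts) in candidates.items():
        if oldest_ts <= last_fetched_ts < newest_ts:
            return name
    return None


if __name__ == "__main__":
    action = next_action(FetchError.CAPPED_POSITION_LOST, retries_left=3)
    if action is Action.RESELECT_SYNC_SOURCE:
        source = choose_sync_source(50, {"a": (100, 200), "b": (40, 180)})
        print(source if source is not None else "too stale")
```

In this model, node `a` has already truncated the entries we need, while node `b` still holds them, so reselection finds a working source without entering rollback. If no candidate covers our position, the caller reports the "too stale" condition instead.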
| Comments |
| Comment by Judah Schvimer [ 24/Jul/20 ] |
|
I checked with samy.lanka and we think this was fixed by |