[SERVER-26253] Get new sync source rather than rollback on RemoteOplogStale error
Created: 22/Sep/16  Updated: 29/Dec/16  Resolved: 29/Dec/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Judah Schvimer
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-27403 Consider term and rbid when validatin... Closed
Related
is related to SERVER-27403 Consider term and rbid when validatin... Closed
Operating System: ALL
Sprint: Repl 2017-01-23
Participants:
Linked BF Score: 15

 Description   

A RemoteOplogStale error means that the OplogFetcher received an empty batch for its remote oplog query, whose predicate selects entries greater than or equal to the fetcher's own most recent oplog entry. In other words, the sync source's most recent oplog entry is older than the node's own most recent oplog entry. This should lead to choosing a new sync source rather than a rollback.
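
The proposed behavior can be illustrated with a minimal sketch. This is not the server's code: `Action`, `decide_action`, and the use of plain integers as oplog timestamps are all hypothetical names and simplifications chosen for illustration. The non-empty branches (first returned entry matches, or does not match, the local last entry) are an assumption added for contrast and are not described in this ticket.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    CONTINUE = "continue"
    CHANGE_SYNC_SOURCE = "change_sync_source"
    ROLLBACK = "rollback"


def decide_action(local_last_ts: int, first_batch: Sequence[int]) -> Action:
    """Hypothetical decision for the first batch of a remote oplog query.

    first_batch holds the timestamps the sync source returned for a query
    with the predicate ts >= local_last_ts, in ascending order.
    """
    if not first_batch:
        # Empty batch: the sync source has nothing at or after our last
        # entry, so it is behind us (the RemoteOplogStale case). Per this
        # ticket's proposal, pick a new sync source instead of rolling back.
        return Action.CHANGE_SYNC_SOURCE
    if first_batch[0] != local_last_ts:
        # Assumption for contrast: the source is ahead but does not contain
        # our last entry, so the histories have diverged.
        return Action.ROLLBACK
    # The source contains our last entry; keep fetching from it.
    return Action.CONTINUE
```

For example, `decide_action(10, [])` selects a new sync source, while `decide_action(10, [10, 11])` keeps fetching from the current one.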



 Comments   
Comment by Spencer Brody (Inactive) [ 29/Dec/16 ]

We have to be careful here. If we never roll back on RemoteOplogStale, we could wind up spinning looking for a sync source and not rolling back when we need to. We should probably figure out a unified solution for this and SERVER-27403 at the same time.

Comment by Eric Milkie [ 23/Sep/16 ]

Scott, you just described the exact situation that occurs when the resync command runs. So switching sync sources would be the correct thing to do in that case. You would never want to roll back in such a situation, as you can always wait until some node gets ahead of you again (and then decide for real if you should roll back).

Comment by Scott Hernandez (Inactive) [ 22/Sep/16 ]

The sync source choosing code only returns a member to sync from that is ahead, so this behavior is correct unless the upstream node goes back in time/oplog-stream.

Generated at Thu Feb 08 04:11:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.