[SERVER-28377] Do not check that remote last applied is ahead of local last fetched in OplogFetcher first batch during initial sync
Created: 17/Mar/17  Updated: 07/Sep/17  Resolved: 03/Apr/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.4.4, 3.5.6

Type: Bug
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Matthew Russotto
Resolution: Done
Votes: 0
Labels: bkp
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-27403 Consider term and rbid when validatin... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v3.4
Sprint: Repl 2017-04-17
Participants:
Linked BF Score: 0

 Description   

There is a race in which the sync source's most recent oplog entries can become visible to, and be read by, downstream nodes before the sync source updates its heartbeat map with its new last applied OpTime. As a result, downstream nodes can receive a stale lastAppliedOpTime in the response metadata, which can cause the check in OplogFetcher::checkRemoteOplogStart to fail spuriously.

This will only cause the OplogFetcher to return early and choose a new sync source, so it should not cause harm beyond unnecessary sync source changes and some very quick initial sync restarts.

We should remove the check that the remote last applied OpTime is greater than or equal to the local last fetched OpTime in OplogFetcher::checkRemoteOplogStart when "requireFresherSyncSource" is false. This will also require changing the comments that explain the boolean's meaning.
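A minimal sketch of the proposed behavior, not the actual server code. The OpTime struct, the CheckResult enum, and the checkRemoteFreshness function are hypothetical stand-ins for the freshness portion of OplogFetcher::checkRemoteOplogStart. The sketch also assumes that when "requireFresherSyncSource" is true, the remote last applied OpTime must be strictly ahead of the local last fetched OpTime.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical simplified OpTime, ordered by term and then by timestamp.
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

inline bool operator<(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) < std::tie(b.term, b.timestamp);
}

enum class CheckResult { kOk, kInvalidSyncSource };

// Illustrative stand-in for the freshness check only.
CheckResult checkRemoteFreshness(const OpTime& remoteLastApplied,
                                 const OpTime& localLastFetched,
                                 bool requireFresherSyncSource) {
    // Proposed change: when a fresher sync source is not required (for
    // example during initial sync), do not compare against the possibly
    // stale remote last applied OpTime at all.
    if (!requireFresherSyncSource) {
        return CheckResult::kOk;
    }
    // Assumption: "fresher" means strictly ahead of what we last fetched.
    if (!(localLastFetched < remoteLastApplied)) {
        return CheckResult::kInvalidSyncSource;
    }
    return CheckResult::kOk;
}
```

With this shape, a stale metadata value can only reject the sync source when the caller explicitly asked for a fresher one.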

An alternative is to use the max of the metadata lastOpApplied and the last OpTime in the batch as the remote last applied OpTime in OplogFetcher::checkRemoteOplogStart.
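The alternative can be sketched as follows. This is an illustration under stated assumptions, not the server implementation: the OpTime struct and the effectiveRemoteLastApplied helper are hypothetical, and the first batch is modeled as a vector of OpTimes in oplog order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical simplified OpTime, ordered by term and then by timestamp.
struct OpTime {
    std::int64_t term;
    std::uint64_t timestamp;
};

inline bool operator<(const OpTime& a, const OpTime& b) {
    return std::tie(a.term, a.timestamp) < std::tie(b.term, b.timestamp);
}

// Returns the OpTime to treat as the remote last applied OpTime: the larger
// of the metadata value and the last OpTime in the first batch. If the batch
// is empty, the metadata value is returned unchanged.
OpTime effectiveRemoteLastApplied(const OpTime& metadataLastOpApplied,
                                  const std::vector<OpTime>& firstBatch) {
    if (firstBatch.empty()) {
        return metadataLastOpApplied;
    }
    return std::max(metadataLastOpApplied, firstBatch.back());
}
```

If the metadata lags behind entries that are already readable, the batch's last OpTime wins, so the stale heartbeat value no longer makes the sync source look behind.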



 Comments   
Comment by Githook User [ 06/Apr/17 ]

Author: Matthew Russotto (mtrussotto) <matthew.russotto@10gen.com>

Message: SERVER-28377 If first batch of OplogFetcher has a document ahead of the remote last applied from heartbeat, use the document's time instead.

(cherry picked from commit 925e245ca4cb59fdec3c008097df612fd48ae00a)
Branch: v3.4
https://github.com/mongodb/mongo/commit/2a5996b761a64787593192b6413bb774bea06da0

Comment by Githook User [ 03/Apr/17 ]

Author: Matthew Russotto (mtrussotto) <matthew.russotto@10gen.com>

Message: SERVER-28377 If first batch of OplogFetcher has a document ahead of the remote last applied from heartbeat, use the document's time instead.
Branch: master
https://github.com/mongodb/mongo/commit/925e245ca4cb59fdec3c008097df612fd48ae00a
