[SERVER-24088] oplog fetcher should retry on getMore ExceededTimeLimit Created: 06/May/16  Updated: 22/Mar/18  Resolved: 23/May/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Judah Schvimer Assignee: Judah Schvimer
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-24113 OplogFetcher getMore callback QueryRe... Closed
Related
related to SERVER-23222 Replset metadata is not attached to c... Closed
related to SERVER-25702 add support to OplogFetcher for resta... Closed
Operating System: ALL
Participants:
Linked BF Score: 0

 Description   

Currently, if a getMore fails with an ExceededTimeLimit error (for example, because there is no data to return), the fetcher shuts down and then looks for a new sync source.

We should fix this so that if a getMore in the fetcher fails with this error, we issue a new getMore instead. We also need to make sure we get and process the metadata, so further work is needed to return the metadata with the QueryResponseStatus and then update the server with that metadata, as we do on successful getMores.
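
A minimal Python sketch of the proposed behavior, not the server's actual C++ code. The names `ExceededTimeLimit`, `Batch`, `fetch_next_batch`, `get_more`, and `process_metadata` are all hypothetical, and the `max_retries` cap is an assumption that this ticket does not specify. It only illustrates retrying the getMore on a timeout while still processing the metadata on both paths.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class ExceededTimeLimit(Exception):
    """Hypothetical stand-in for a getMore that timed out.

    Assumes the error response still carries replica set metadata.
    """

    def __init__(self, metadata: dict):
        super().__init__("ExceededTimeLimit")
        self.metadata = metadata


@dataclass
class Batch:
    """Hypothetical successful getMore response."""
    docs: List[dict]
    metadata: dict


def fetch_next_batch(
    get_more: Callable[[], Batch],
    process_metadata: Callable[[dict], None],
    max_retries: int = 3,  # assumed cap; not part of the ticket
) -> Optional[Batch]:
    """Issue a getMore, retrying on ExceededTimeLimit.

    Returns None once the retries are used up, so the caller can
    fall back to choosing a new sync source.
    """
    for _ in range(max_retries + 1):
        try:
            batch = get_more()
        except ExceededTimeLimit as err:
            # Timed out: still process the metadata, then retry.
            process_metadata(err.metadata)
            continue
        # Success: process the metadata exactly as on the error path.
        process_metadata(batch.metadata)
        return batch
    return None
```

Returning `None` after the retries are exhausted keeps the existing fallback of picking a new sync source available to the caller.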



 Comments   
Comment by Eric Milkie [ 16/May/16 ]

If the getMore returned ExceededTimeLimit, that implies that there was a problem on the upstream node (e.g. the oplog query took too long to complete). Therefore, choosing a new sync source is probably the right thing to do.

Comment by Judah Schvimer [ 10/May/16 ]

As part of this work we will need to change the fetcher to store the cursorId and namespace used for the initial find, since they will not be returned in the error case and are needed for subsequent getMores. Rather than re-parsing them from each later response, the fetcher can simply invariant that they have not changed.
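
A rough Python sketch of that bookkeeping, for illustration only. `CursorState` and its methods are hypothetical names, and a raised `RuntimeError` stands in for the server's invariant. The command shape (`getMore` plus `collection`) follows the getMore command; splitting the namespace at its first dot to get the collection name is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CursorState:
    """Hypothetical holder for the cursor id and namespace of the initial find."""
    cursor_id: Optional[int] = None
    namespace: Optional[str] = None

    def record(self, cursor_id: int, namespace: str) -> None:
        """Store the values from the first response; afterwards only check them."""
        if self.cursor_id is None:
            self.cursor_id, self.namespace = cursor_id, namespace
            return
        # Stand-in for an invariant: later responses must agree with the stored values.
        if (cursor_id, namespace) != (self.cursor_id, self.namespace):
            raise RuntimeError("cursor id or namespace changed between responses")

    def get_more_command(self) -> dict:
        """Build the next getMore from stored state, even after an error response."""
        if self.cursor_id is None:
            raise RuntimeError("no cursor recorded yet")
        # "db.collection": everything after the first dot is the collection name.
        _db, _, collection = self.namespace.partition(".")
        return {"getMore": self.cursor_id, "collection": collection}
```

For example, after `record(42, "local.oplog.rs")`, `get_more_command()` builds the next command without needing anything from the failed response.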

Comment by Judah Schvimer [ 09/May/16 ]

When we process the metadata, we should only set the lastCommittedOptime and update the term if doing so moves them forward.
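
The forward-only rule could look roughly like this in Python. `ReplState` and `apply_metadata` are hypothetical names, and modelling an optime as a `(term, timestamp)` tuple compared lexicographically is an assumption made for this sketch, not a statement about the server's types.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed model: an optime is (term, timestamp), compared lexicographically.
OpTime = Tuple[int, int]


@dataclass
class ReplState:
    """Hypothetical local view of the term and last committed optime."""
    term: int = 0
    last_committed: OpTime = (0, 0)

    def apply_metadata(self, term: int, last_committed: OpTime) -> None:
        """Apply response metadata, never moving either value backward."""
        if term > self.term:
            self.term = term
        if last_committed > self.last_committed:
            self.last_committed = last_committed
```

Stale metadata, such as values carried on a delayed or timed-out response, is then ignored rather than overwriting newer state.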
