[SERVER-39445] Executing remote command request during collection cloning in initial sync should not use RemoteCommandRequest::kNoTimeout. Created: 08/Feb/19 Updated: 27/Oct/23 Resolved: 06/Jan/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Backlog - Replication Team |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | former-quick-wins | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: | Replication |
||||
| Operating System: | ALL | ||||
| Participants: | |||||
| Linked BF Score: | 19 | ||||
| Description |
|
Currently, during the collection cloning phase of initial sync, the sync target issues many remote command requests with RemoteCommandRequest::kNoTimeout. As a result, the sync target can hang forever on those remote commands if a network issue prevents it from reaching the sync source. My suggestion is to set a deadline on those remote commands, as we do for oplog fetching. When a command times out, initial sync fails. This at least allows the sync target to retry initial sync with a different sync source. Below is the list of those remote commands issued with kNoTimeout. |
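For illustration only (this is not the list referred to above, and not the server's code), the suggestion could be sketched roughly as follows. Every name here (SketchRequest, withDeadline, cloneWithRetry, Sender, Outcome) is hypothetical and invented for the sketch; the "listCollections" command string is just a placeholder, and the network call is simulated by a caller-supplied function rather than a real executor.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <optional>
#include <string>
#include <vector>

using Millis = std::chrono::milliseconds;

// Hypothetical stand-in for a remote command request (not the server's type).
struct SketchRequest {
    std::string target;             // sync source host:port
    std::string command;            // placeholder command name
    std::optional<Millis> timeout;  // std::nullopt models "no timeout"
};

// Replace a missing timeout with an explicit deadline so a request can
// never wait indefinitely.
SketchRequest withDeadline(SketchRequest req, Millis defaultTimeout) {
    assert(defaultTimeout.count() > 0);
    if (!req.timeout) {
        req.timeout = defaultTimeout;
    }
    return req;
}

enum class Outcome { kOk, kTimedOut };

// Simulated transport: the caller decides which requests succeed.
using Sender = std::function<Outcome(const SketchRequest&)>;

// Try each candidate sync source in order. A timed-out attempt counts as a
// failed attempt, and the loop moves on to the next source.
// Returns the source that answered, or std::nullopt if every attempt timed out.
std::optional<std::string> cloneWithRetry(const std::vector<std::string>& sources,
                                          Millis timeout,
                                          const Sender& send) {
    for (const auto& source : sources) {
        SketchRequest req =
            withDeadline({source, "listCollections", std::nullopt}, timeout);
        if (send(req) == Outcome::kOk) {
            return source;
        }
    }
    return std::nullopt;
}
```

The point of the sketch is only that a bounded timeout turns an indefinite hang into a failed attempt, which a retry loop can then recover from by choosing another sync source.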
| Comments |
| Comment by Judah Schvimer [ 26/Sep/19 ] |
|
matthew.russotto, will this go away with Resumable Initial Sync's cloner refactor? |