[SERVER-63120] Handle recipient secondary failures while performing file copy based cloning procedure. Created: 28/Jan/22  Updated: 03/Mar/23  Resolved: 03/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Suganthi Mani Assignee: [DO NOT USE] Backlog - Server Serverless (Inactive)
Resolution: Won't Do Votes: 0
Labels: shard-merge-milestone-3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-61144 Finish importing donated collections ... Closed
Duplicate
duplicates SERVER-63390 Abort merge on error from OpObserver ... Closed
Assigned Teams:
Serverless
Participants:

 Description   

Currently I am not clear on how we handle errors on recipient secondaries while copying and importing donor files.

1) Suppose copying a particular donor file fails on a recipient secondary. How do we prevent recipient secondaries from copying subsequent donor files? In the current design, for each entry in the backup cursor response (the list of donor files to be imported) we generate an oplog entry, and an op observer for that entry triggers the copy of that particular file.

2) Does the recipient secondary need to inform the recipient primary to abort the merge?
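
To make question 1 concrete, here is a minimal Python sketch of one possible behavior, not the server's actual implementation: a recipient secondary records the first copy failure and skips every later file-import oplog entry. `FileCopyState`, `on_import_file_oplog_entry`, `flaky_copy`, and the file names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FileCopyState:
    """Hypothetical per-merge state kept on a recipient secondary."""
    first_error: Optional[Exception] = None
    copied: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def on_import_file_oplog_entry(
    state: FileCopyState,
    file_name: str,
    copy_file: Callable[[str], None],
) -> None:
    """Stand-in for the op observer hook that runs once per donor file entry."""
    if state.first_error is not None:
        # An earlier copy failed, so do not attempt any subsequent file.
        state.skipped.append(file_name)
        return
    try:
        copy_file(file_name)
    except OSError as exc:
        # Remember the first failure; later entries will see it and be skipped.
        state.first_error = exc
        return
    state.copied.append(file_name)


def flaky_copy(file_name: str) -> None:
    """Assumed copy routine that fails on one made-up file name."""
    if file_name == "index-2.wt":
        raise OSError("simulated copy failure")


state = FileCopyState()
for name in ["collection-1.wt", "index-2.wt", "collection-3.wt"]:
    on_import_file_oplog_entry(state, name, flaky_copy)
```

In this sketch the failure is only remembered locally. Whether and how the secondary should also report it to the recipient primary is exactly what question 2 asks.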



 Comments   
Comment by Suganthi Mani [ 03/Mar/23 ]

We agreed in the shard merge design that the R primary waits for all nodes to vote, but only for a tunable time period. If it does not receive the votes within that time period, the merge fails.
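
The agreed behavior can be sketched roughly as follows, in Python rather than server code. `VoteCollector`, `record_vote`, `wait_for_votes`, and the node names are hypothetical. Treating an explicit failure vote as an immediate failure is an assumption of this sketch and is not stated in the design summary above.

```python
import threading
from typing import Dict, Iterable


class VoteCollector:
    """Hypothetical helper for a recipient primary waiting on node votes."""

    def __init__(self, expected_nodes: Iterable[str]) -> None:
        self._expected = set(expected_nodes)
        self._votes: Dict[str, bool] = {}
        self._cond = threading.Condition()

    def record_vote(self, node: str, ok: bool) -> None:
        """Store one node's vote and wake up any waiter."""
        with self._cond:
            self._votes[node] = ok
            self._cond.notify_all()

    def _decided(self) -> bool:
        # Assumption: any failure vote decides the outcome early.
        any_failed = not all(self._votes.values())
        all_voted = set(self._votes) >= self._expected
        return any_failed or all_voted

    def wait_for_votes(self, timeout_secs: float) -> bool:
        """Return True only if every expected node voted ok before the timeout."""
        with self._cond:
            decided = self._cond.wait_for(self._decided, timeout=timeout_secs)
            if not decided:
                return False  # Votes missing after the tunable period: fail the merge.
            return all(self._votes.values())


collector = VoteCollector(["secondary-1", "secondary-2"])
collector.record_vote("secondary-1", True)
# Only one of two votes has arrived, so this wait times out.
timed_out_result = collector.wait_for_votes(timeout_secs=0.05)
collector.record_vote("secondary-2", True)
# Both votes are now present, so this returns without waiting.
complete_result = collector.wait_for_votes(timeout_secs=0.05)
```

With a timeout on the primary, a secondary that fails silently still causes the merge to fail, so an explicit abort message from the secondary becomes an optimization rather than a requirement.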

Comment by Suganthi Mani [ 28/Jan/22 ]

CC jesse

Generated at Thu Feb 08 05:56:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.