[SERVER-63120] Handle recipient secondary failures while performing file copy based cloning procedure. Created: 28/Jan/22 Updated: 03/Mar/23 Resolved: 03/Mar/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | [DO NOT USE] Backlog - Server Serverless (Inactive) |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | shard-merge-milestone-3 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: |
Serverless
|
| Participants: | |||||||||||||||||
| Description |
|
Currently I am not clear on how we handle errors on recipient secondaries while copying and importing donor files. 1) Say copying a particular donor file fails on a recipient secondary: how do we prevent recipient secondaries from copying subsequent donor files? The current design is that for each entry in the backup cursor response (the list of donor files to be imported), we generate an oplog entry, and an op observer for that entry triggers the copy of that particular file. 2) Does the recipient secondary need to inform the recipient primary to abort the merge? |
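To make question 1 concrete, here is a minimal Python sketch of the per-entry design described above, with one possible way to stop later copies after a failure. It is only an illustration, not the server's implementation: `FileCopyObserver`, `on_import_entry`, `flaky_copy`, and the file names are all hypothetical, and the "latch the first failure" behavior is an assumption rather than something this ticket settles.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FileCopyObserver:
    """Hypothetical stand-in for the op observer on a recipient secondary."""

    copy_file: Callable[[str], None]   # assumed copy routine; raises OSError on failure
    failed_file: Optional[str] = None  # first donor file whose copy failed, if any
    copied: List[str] = field(default_factory=list)

    def on_import_entry(self, file_name: str) -> bool:
        """Handle one 'donor file to import' oplog entry; return True if copied."""
        if self.failed_file is not None:
            # An earlier copy failed, so skip every subsequent donor file.
            return False
        try:
            self.copy_file(file_name)
        except OSError:
            # Remember the failure so later entries become no-ops.
            self.failed_file = file_name
            return False
        self.copied.append(file_name)
        return True


def flaky_copy(file_name: str) -> None:
    """Illustrative copy routine that fails on one particular file."""
    if file_name == "collection-2.wt":
        raise OSError(f"copy failed: {file_name}")


observer = FileCopyObserver(copy_file=flaky_copy)
results = [
    observer.on_import_entry(name)
    for name in ["collection-1.wt", "collection-2.wt", "index-3.wt"]
]
```

In this sketch the third entry is skipped because the second one failed. Whether the secondary should also report the failure to the recipient primary (question 2) is left open here.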
| Comments |
| Comment by Suganthi Mani [ 03/Mar/23 ] |
|
We agreed in the shard merge design that the recipient (R) primary waits for all nodes to vote only for a tunable time period. If it does not receive the votes within that period, the merge fails.
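The timed wait agreed on in this comment can be sketched roughly as follows in Python. This is a minimal illustration under stated assumptions, not the actual server code: `wait_for_votes`, the in-memory `queue.Queue` of node names, and the node names themselves are hypothetical, and a secondary that hit an error is modeled simply as one that never votes.

```python
import queue
import time
from typing import Set


def wait_for_votes(votes: queue.Queue, expected_nodes: Set[str],
                   timeout_secs: float) -> bool:
    """Return True only if every expected node votes before the deadline."""
    pending = set(expected_nodes)
    deadline = time.monotonic() + timeout_secs  # tunable wait period
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # period elapsed with votes still missing: fail the merge
        try:
            node = votes.get(timeout=remaining)
        except queue.Empty:
            return False  # no further vote arrived in time: fail the merge
        pending.discard(node)
    return True


# One secondary never votes (for example, its file copy failed).
partial = queue.Queue()
partial.put("secondary-1")
merge_ok_partial = wait_for_votes(
    partial, {"secondary-1", "secondary-2"}, timeout_secs=0.05)

# Every node votes within the period.
complete = queue.Queue()
complete.put("secondary-1")
complete.put("secondary-2")
merge_ok_complete = wait_for_votes(
    complete, {"secondary-1", "secondary-2"}, timeout_secs=0.05)
```

Under this model a failing secondary does not need to send an explicit abort: its missing vote is enough for the primary to fail the merge once the period expires.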
| Comment by Suganthi Mani [ 28/Jan/22 ] |
|
CC jesse |