[SERVER-48255] Don't run waitForReplication hook while test is running a full collection validation Created: 15/May/20 Updated: 27/Oct/23 Resolved: 26/May/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.5 Required, 4.4 Required |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jason Chan | Assignee: | Backlog - Replication Team |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: | Replication | ||||
| Operating System: | ALL | ||||
| Participants: | |||||
| Linked BF Score: | 36 | ||||
| Description |
|
Recently, full collection validation has started taking an exceptionally long time in the backup_restore tests. Because the validate command takes the PBWM (parallel batch writer mode) lock, it blocks oplog application for as long as it runs, so any concurrent call to awaitReplication ends up timing out. Ideally, we would fix this by improving the performance of collection validation, but a quicker fix might be to avoid calling awaitReplication while a validate command is running. |
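The interaction described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not MongoDB code, every name in it (simulateAwaitReplication, validateTicks, awaitTimeoutTicks, pendingOplogEntries) is hypothetical, a single boolean stands in for the PBWM lock, and time is a simulated tick counter rather than a real clock.

```javascript
// Toy model only: none of these names are MongoDB APIs.
// One exclusive flag stands in for the PBWM lock; time is a simulated tick counter.
function simulateAwaitReplication({validateTicks, awaitTimeoutTicks, pendingOplogEntries}) {
    let applied = 0;
    for (let tick = 1; tick <= awaitTimeoutTicks; tick++) {
        // Validation holds the lock for its first `validateTicks` ticks.
        const lockHeldByValidate = tick <= validateTicks;
        // The oplog applier makes progress only while validation does not hold the lock.
        if (!lockHeldByValidate && applied < pendingOplogEntries) {
            applied++;
        }
        // awaitReplication succeeds once every pending entry has been applied.
        if (applied === pendingOplogEntries) {
            return {ok: true, ticks: tick, applied};
        }
    }
    // The wait expired before the secondary caught up.
    return {ok: false, ticks: awaitTimeoutTicks, applied};
}

// Short validation: the wait succeeds after the lock is released.
const fast = simulateAwaitReplication({validateTicks: 2, awaitTimeoutTicks: 10, pendingOplogEntries: 3});
// Validation outlasts the wait: no entries are applied and the wait times out.
const slow = simulateAwaitReplication({validateTicks: 20, awaitTimeoutTicks: 10, pendingOplogEntries: 3});
```

In this model the wait fails whenever validation holds the lock for longer than the timeout minus the time needed to apply the pending entries, which is why both a faster validation and not waiting during validation would avoid the failure.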
| Comments |
| Comment by Jason Chan [ 26/May/20 ] |
|
We closed this since we expect |
| Comment by Suganthi Mani [ 18/May/20 ] |
|
FYI, awaitReplication is called as part of the waitForReplication background hook. Also, by default we run full collection validation when a node shuts down. Another alternative is for the secondary backup node shutdown to call the rst.stop method with skipValidation: true, which skips collection validation. This ensures we don't run collection validation while the FSM workload and WaitForReplication are in progress. |
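A minimal sketch of the skipValidation alternative follows. The helper name stopNodeSkippingValidation is hypothetical, and the (node, signal, opts) argument order for rst.stop is an assumption about the shell's ReplSetTest fixture rather than something stated in this ticket; only the skipValidation: true option comes from the comment above.

```javascript
// Sketch: shut down one replica-set node without the default full collection validation.
// `rst` is expected to be a ReplSetTest-like object from the mongo shell test library.
// Assumption: stop() takes (node, signal, opts); check the fixture before relying on this.
function stopNodeSkippingValidation(rst, node) {
    // Leave the signal unset so the fixture uses its default shutdown signal.
    // skipValidation: true is the option named in the comment above.
    return rst.stop(node, undefined, {skipValidation: true});
}
```

In a backup_restore style test this would be called for the secondary backup node only, so that other nodes still get validated on shutdown.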
| Comment by Jason Chan [ 18/May/20 ] |
|
suganthi.mani points out that another alternative is to switch these tests to background validation, but this would require the changes to be made in |