[SERVER-46420] Fix move_jumbo_chunk.js to not require disabling the resumable range deleter Created: 26/Feb/20  Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Matthew Saltz (Inactive) Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: sharding-wfbf-sprint
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-46230 Update tests to work with resumable r... Closed
Assigned Teams:
Cluster Scalability
Sprint: Sharding 2020-12-14, Sharding 2020-12-28, Sharding 2021-01-11, Sharding 2021-01-25, Sharding 2021-02-22, Sharding 2021-03-08, Sharding 2021-03-22, Sharding 2021-04-05, Sharding 2021-04-19, Sharding 2021-05-03
Participants:

 Description   

This test can only run with the resumable range deleter disabled. It sets a failpoint that makes the recipient shard hang, so advanceTransactionOnRecipient hangs on the recipient while trying to check out the session, which prevents the migration from completing.

Unfortunately, this is hard to fix in this test because the balancer performs the migration. In addition, we can't avoid setting the migrateThreadHangAtStep2 failpoint: without it, this call to the recipient could yield Status::OK(), bypassing the error the test is trying to hit.
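
For context, failpoints such as migrateThreadHangAtStep2 are generally toggled through the configureFailPoint admin command. The sketch below only builds the command documents, so it runs without a cluster. The helper names enableFailPointCmd and disableFailPointCmd are hypothetical, and how move_jumbo_chunk.js actually sets the failpoint is not shown in this ticket.

```javascript
// Hypothetical helpers (not from the test): build configureFailPoint command documents.
function enableFailPointCmd(name) {
    // mode "alwaysOn" keeps the failpoint active until it is explicitly turned off.
    return {configureFailPoint: name, mode: "alwaysOn"};
}

function disableFailPointCmd(name) {
    // mode "off" releases any thread that is waiting on the failpoint.
    return {configureFailPoint: name, mode: "off"};
}

const hang = enableFailPointCmd("migrateThreadHangAtStep2");
const release = disableFailPointCmd("migrateThreadHangAtStep2");

console.log(JSON.stringify(hang));
console.log(JSON.stringify(release));
```

In a test, these documents would be passed to the recipient shard, for example recipient.adminCommand(hang), assuming recipient is a connection to that shard. Until release is sent, the recipient stays parked at the failpoint.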

I'm not sure what a good solution is, but fixing this would be required to remove the disableResumableRangeDeleter flag in 4.6.

As a side note for whoever looks at this later: it is not necessary to test three rounds of the balancer, and in fact the current behavior is incorrect. The first migration attempt fails with ExceededMemoryLimit as expected, but the next two attempts fail with ConflictingOperationInProgress because the recipient remains hung at the failpoint even after the first migration fails.
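
The sequence of errors described above can be illustrated with a toy model. This is not server or test code: runBalancerRounds and the recipientHung flag are made up for illustration, and only the two error names come from this ticket.

```javascript
// Toy model only. It mimics the outcome described in the ticket, not the server logic.
function runBalancerRounds(rounds) {
    let recipientHung = false;  // true once the recipient is parked at the failpoint
    const results = [];
    for (let i = 0; i < rounds; i++) {
        if (recipientHung) {
            // The recipient is still stuck in the earlier, failed migration,
            // so a new migration attempt is rejected.
            results.push("ConflictingOperationInProgress");
            continue;
        }
        // First attempt: the recipient hangs at the failpoint and the migration
        // fails with the error the test expects.
        recipientHung = true;
        results.push("ExceededMemoryLimit");
    }
    return results;
}

console.log(runBalancerRounds(3));
```

In this model only the first round exercises the expected error, which is why a single balancer round would be enough for the test.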



 Comments   
Comment by Haley Connelly [ 01/Apr/21 ]

Determined that this belongs in the wfbf sprint since there isn't a clear solution.  

Generated at Thu Feb 08 05:11:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.