[SERVER-57990] Make ReshardingDonorServiceTest and ReshardingRecipientServiceTest stepdown test cases more realistic Created: 22/Jun/21  Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Haley Connelly Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: PM-234, cs-subteam1, sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Cluster Scalability
Participants:

 Description   

Currently, resharding_donor_service_test.cpp and resharding_recipient_service_test.cpp have stepdown tests that rely on mocking the coordinator's state changes to make progress.

Issue: The test loop iterates over the states at which to pause and then step down, but it can mock behavior that cannot occur in the real system: it can mock the coordinator's transition to kBlockingWrites before the donor itself is in kDonatingOplogEntries.

Example:
. state = DonorStateEnum::kDonatingOplogEntries in the test loop
. at the start of this iteration, ReshardingDonorDocument.state = kDonatingInitialData on disk
. the test mocks the coordinator's transition into kApplying
. the donor finishes all the necessary work and is ready to transition to 'state' kDonatingOplogEntries
. the test waits until OpObserverForTest::onUpdate witnesses the attempt to transition to kDonatingOplogEntries
. the test calls stepDown, which interrupts the opCtx and causes OpObserverForTest::onUpdate to throw, so the write ReshardingDonorDocument.state = kDonatingOplogEntries fails
. (next iteration) state = DonorStateEnum::kBlockingWrites, but ReshardingDonorDocument.state is still kDonatingInitialData
. the test mocks the coordinator's transition to kBlockingWrites before the donor is in kDonatingOplogEntries, which is illegal in the real system

Note:
We want to preserve the behavior that the stepdown occurs before the participant persists its new state to its local ReshardingDocument.
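
A minimal sketch of the ordering constraint from the example above, using a simplified, hypothetical model rather than the real test fixtures. The names FakeDonor, tryPersist, and canMockCoordinatorTransition are illustrative assumptions and do not exist in the codebase. Only the states named in this ticket are modeled, and the rule that a mocked kApplying is always allowed is also an assumption. The idea is that the test loop would check the donor's persisted state before mocking the coordinator's next transition, while a stepdown still interrupts the write before the new state is persisted.

```cpp
#include <cassert>

// Simplified stand-ins for the states named in this ticket (not the real enums).
enum class DonorState { kDonatingInitialData, kDonatingOplogEntries, kBlockingWrites };
enum class CoordinatorState { kApplying, kBlockingWrites };

// Hypothetical donor model that tracks only the state persisted on disk.
struct FakeDonor {
    DonorState persisted = DonorState::kDonatingInitialData;

    // Attempts to persist newState. If a stepdown interrupts the write,
    // the persisted state is left unchanged and false is returned.
    bool tryPersist(DonorState newState, bool stepDownDuringWrite) {
        if (stepDownDuringWrite) {
            return false;
        }
        persisted = newState;
        return true;
    }
};

// Returns true if mocking the coordinator's move to `next` is consistent with
// the donor's persisted state. Assumption: kApplying may always be mocked;
// kBlockingWrites requires the donor to have persisted kDonatingOplogEntries
// (or a later state).
bool canMockCoordinatorTransition(CoordinatorState next, DonorState persisted) {
    switch (next) {
        case CoordinatorState::kApplying:
            return true;
        case CoordinatorState::kBlockingWrites:
            return persisted == DonorState::kDonatingOplogEntries ||
                   persisted == DonorState::kBlockingWrites;
    }
    return false;
}
```

With a guard like this, the "(next iteration)" step in the example would first let the stepped-up donor retry and persist kDonatingOplogEntries, and only then mock the coordinator's transition to kBlockingWrites.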

