Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.2.0-rc0
Affects Version/s: None
Component/s: None
Labels:
- resharding-success-rate-improvements

Assigned Teams:

Cluster Scalability
Backwards Compatibility:
Fully Compatible
Sprint:
ClusterScalability Jun9-Jun23
Linked BF Score:
200
Confidence Status:
None
Work Order:
3
Size Category:
TBD
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Estimated Weeks:
0

Currently:

Upon transitioning to the “cloning” state, the ReshardingOplogFetcher on each recipient starts running aggregate commands to fetch oplog entries with readPreference “nearest”.
~~SERVER-103554~~ Upon transitioning to the “blocking-writes” state, the coordinator tells each recipient to refresh its shard version. Previously, it did that only for donors. The refresh causes each recipient to notify its ReshardingOplogFetcher to interrupt any in-progress aggregate/getMore command and start new ones with readPreference “primary”.

The latest testing results show that this is not sufficient.

If a recipient has been fetching oplog entries from stale secondaries, the remaining time estimate can be far off. For example, at the moment the estimate is computed, the numbers of oplog entries fetched and applied are both 3000, so the estimated remaining time is 0. However, the total number of entries to fetch is 5000. The recipient may not finish fetching and applying the remaining 2000 entries within the critical section timeout of 5 seconds.
~~SERVER-104303~~ does not solve this problem because it only accounts for majority replication lag (the writes to enter or exit the critical section use w: majority, not w: all), and the recipient could be fetching from a secondary that is not part of the majority.
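
The numbers in the example above can be worked through in a short sketch. This is only an illustration under stated assumptions: the estimate formula here (entries fetched but not yet applied, multiplied by a per-entry cost) is a simplified stand-in rather than the server's actual calculation, and `RecipientProgress`, `secs_per_entry`, and the 0.004 s cost are hypothetical. Only the 3000/3000/5000 counts and the 5 second timeout come from the description.

```python
from dataclasses import dataclass

# Critical section timeout quoted in the ticket description.
CRITICAL_SECTION_TIMEOUT_SECS = 5.0


@dataclass
class RecipientProgress:
    fetched: int            # oplog entries fetched so far (from a stale secondary)
    applied: int            # oplog entries applied so far
    total_on_primary: int   # entries that actually exist on the donor primary
    secs_per_entry: float   # hypothetical average cost to fetch and apply one entry


def estimated_remaining_secs(p: RecipientProgress) -> float:
    # Assumed simplified model: the recipient only knows about entries it has
    # already fetched, so the estimate ignores entries the stale secondary lacks.
    return (p.fetched - p.applied) * p.secs_per_entry


def actual_remaining_secs(p: RecipientProgress) -> float:
    # What is really left once the recipient reads from an up-to-date node.
    return (p.total_on_primary - p.applied) * p.secs_per_entry


progress = RecipientProgress(
    fetched=3000, applied=3000, total_on_primary=5000, secs_per_entry=0.004
)
estimate = estimated_remaining_secs(progress)
actual = actual_remaining_secs(progress)
exceeds_timeout = actual > CRITICAL_SECTION_TIMEOUT_SECS
```

Under these assumed numbers the estimate is zero while the real remaining work is longer than the critical section timeout, which is the gap the ticket describes.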

The proposal is as follows. During the “applying” state, if the remaining time estimate calculated as part of a _shardsvrReshardingOperationTime command is less than some threshold (configurable via a server parameter), the ReshardingOplogFetcher would start using readPreference “primary” for new aggregate commands. As in ~~SERVER-103554~~, any in-progress aggregation (aggregate/getMore command) would be interrupted.
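
The proposed decision logic could look roughly like the sketch below. It is a toy model, not the server implementation: `OplogFetcherSketch`, `on_operation_time_estimate`, and the `threshold_secs` argument are hypothetical names (the ticket does not name the server parameter), and the interruption of in-progress commands is represented by a simple counter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OplogFetcherSketch:
    """Hypothetical stand-in for a recipient's ReshardingOplogFetcher."""
    read_preference: str = "nearest"
    interruptions: int = 0

    def switch_to_primary(self) -> None:
        # Switch only once; later estimates below the threshold are no-ops.
        if self.read_preference != "primary":
            self.read_preference = "primary"
            # Stand-in for interrupting any in-progress aggregate/getMore.
            self.interruptions += 1


def on_operation_time_estimate(
    fetcher: OplogFetcherSketch,
    state: str,
    remaining_secs: Optional[float],
    threshold_secs: float,
) -> str:
    """Called whenever a remaining-time estimate is computed (assumed hook)."""
    if (
        state == "applying"
        and remaining_secs is not None
        and remaining_secs < threshold_secs
    ):
        fetcher.switch_to_primary()
    return fetcher.read_preference


fetcher = OplogFetcherSketch()
threshold = 30.0  # hypothetical threshold value, in seconds
far_from_done = on_operation_time_estimate(fetcher, "applying", 600.0, threshold)
almost_done = on_operation_time_estimate(fetcher, "applying", 10.0, threshold)
still_primary = on_operation_time_estimate(fetcher, "applying", 5.0, threshold)
```

Making the switch idempotent means repeated estimates below the threshold do not keep interrupting the fetcher once it is already reading from the primary.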

is related to

SERVER-103554 Make ReshardingOplogFetcher fetch oplog entries from the primary during the critical section

Closed

SERVER-104303 Replication lag on resharding donors can lead to critical section timeout

Closed

related to

SERVER-105915 resharding_critical_section_repl_lag.js recipient can target the lagged donor secondary causing the test to hang

Closed

SERVER-106341 Fix race condition in ReshardingOplogFetcherTest

In Progress

Assignee: Cheahuychou Mao
Reporter: Cheahuychou Mao
Participants: Cheahuychou Mao, Githook User
Votes: 0
Watchers: 3

Created: Jun 03 2025 03:19:30 PM UTC
Updated: Jun 20 2025 07:52:56 PM UTC
Resolved: Jun 03 2025 08:47:30 PM UTC
