Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.3.0-rc0
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Workload Resilience
Backwards Compatibility: Fully Compatible
Sprint: Workload Resilience 2025-12-22
Linked BF Score: 200
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

In SPM-4003, we introduced new passthrough suites that simulate the ingress request rate limiter rejecting a high percentage of requests, in order to test the server's load-shedding retry behavior. One of these passthroughs is derived from sharding_jscore_passthrough_with_balancer, which causes random rebalancing to occur during tests. Sometimes the balancer kicks in and takes a long time to complete because many of its requests are rejected. This can leave a large batched write unable to make progress for several rounds, since it cannot refresh routing info while resharding is holding a lock. See SERVER-92228 and BF-34032 for a discussion of a similar issue encountered before. As a temporary workaround to resolve the attached BF, we should increase the number of no-progress rounds allowed for this suite.

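For example, the following log lines show a round that made no progress after a routing info refresh did not complete: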

{"t":{"$date":"2025-12-19T19:59:53.739+00:00"},"s":"D4", "c":"SHARDING", "id":22907,   "svc":"R", "ctx":"conn29","msg":"Write results received","attr":{"shardInfo":"localhost:20002","status":"ShardCannotRefreshDueToLocksHeld{ nss: \"test.system.resharding.76b85189-0545-4298-bbdf-a4466d32b73e\" }: Routing info refresh did not complete"}}
{"t":{"$date":"2025-12-19T19:59:53.745+00:00"},"s":"D4", "c":"SHARDING", "id":9986810, "svc":"R", "ctx":"conn29","msg":"Completed round","attr":{"rounds completed":2}}
{"t":{"$date":"2025-12-19T19:59:53.745+00:00"},"s":"D5", "c":"SHARDING", "id":9986809, "svc":"R", "ctx":"conn29","msg":"No progress made this round","attr":{"num rounds without progress":1}}
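To illustrate why raising the no-progress limit helps, here is a minimal Python sketch of a round-based batched write that aborts after too many consecutive rounds without progress. This is not the server's implementation: `run_batched_write`, `blocked_then_ok`, `NoProgressError`, and the parameter `max_rounds_without_progress` are hypothetical names chosen for illustration, and the exact abort condition in the server may differ.

```python
from typing import Callable, List


class NoProgressError(RuntimeError):
    """Raised when too many consecutive rounds complete no writes."""


def run_batched_write(
    pending: List[str],
    send_round: Callable[[List[str]], List[str]],
    max_rounds_without_progress: int,
) -> int:
    """Retry `pending` ops round by round and return the rounds completed.

    `send_round` is a hypothetical callback that returns the ops that
    succeeded in that round (an empty list means no progress).
    """
    rounds_completed = 0
    rounds_without_progress = 0
    remaining = list(pending)
    while remaining:
        succeeded = set(send_round(remaining))
        rounds_completed += 1
        if succeeded:
            # Any progress resets the consecutive no-progress counter.
            rounds_without_progress = 0
            remaining = [op for op in remaining if op not in succeeded]
        else:
            rounds_without_progress += 1
            if rounds_without_progress > max_rounds_without_progress:
                raise NoProgressError(
                    f"{rounds_without_progress} rounds without progress"
                )
    return rounds_completed


def blocked_then_ok(blocked_rounds: int) -> Callable[[List[str]], List[str]]:
    """Build a sender that fails for `blocked_rounds` rounds, then succeeds.

    The failing rounds stand in for a routing refresh that cannot complete
    while a lock is held.
    """
    calls = {"n": 0}

    def send_round(ops: List[str]) -> List[str]:
        calls["n"] += 1
        if calls["n"] <= blocked_rounds:
            return []  # no op succeeded this round
        return list(ops)

    return send_round
```

With a sender that is blocked for three rounds, a limit of 5 lets the write finish in four rounds, while a limit of 2 raises `NoProgressError` before the lock would have been released. A higher limit trades a longer worst-case wait for tolerance of slow balancer rounds.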

is related to: SERVER-92228 Revisit the default value of max number of no progress before aborting batch write. (Closed)

related to: SERVER-115873 Use default maxRoundsWithoutProgressParameter in rate limited sharding with balancer passthrough (Backlog)

Assignee: Patrick Freed
Reporter: Patrick Freed
Participants: Githook User, Patrick Freed
Votes: 0
Watchers: 2

Created: Dec 19 2025 08:43:57 PM UTC
Updated: Dec 19 2025 10:37:55 PM UTC
Resolved: Dec 19 2025 10:36:17 PM UTC
