-
Type: Task
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Cluster Scalability
-
Fully Compatible
-
ClusterScalability 16Feb-2Mar
The resharding failover workload often times out due to repeated interruptions:
[2026/02/15 09:32:13.002] [info ] [locust.locust-dsi_secondary_0_3-147-82-251] [2026-02-15 14:32:12,996] localhost/WARNING/user: Attempt 105 failed: operation was interrupted, full error: {'ok': 0.0, 'errmsg': 'operation was interrupted', 'code': 6, 'codeName': 'HostUnreachable', '$clusterTime': {'clusterTime': Timestamp(1771165932, 3309), 'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'keyId': 0}}, 'operationTime': Timestamp(1771165932, 3309)}. Retrying in 1s...
Shard stepdowns are currently configured to occur every 15 seconds, and config server stepdowns every 40 seconds. Consider reducing the shard stepdown frequency so that the resharding operation can make forward progress. Alternatively, investigate whether the workload should use force: true with the replSetStepDown command, as this setting may explain the increased number of HostUnreachable errors.
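For reference when investigating the force option, here is a minimal sketch of how a stepdown command document could be built in Python. The helper name build_step_down_command and its default values are hypothetical and are not taken from the workload. Only the replSetStepDown, force, and secondaryCatchUpPeriodSecs fields are part of the server command. As documented, with force: false the primary steps down only if an electable secondary catches up within the catch-up period, whereas force: true makes it step down regardless.

```python
from typing import Any, Dict, Optional


def build_step_down_command(
    step_down_secs: int = 60,
    force: bool = False,
    catch_up_secs: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a replSetStepDown command document.

    Hypothetical helper for illustration; the defaults are assumptions,
    not the values used by the failover workload.
    """
    # The command name must be the first key; Python dicts keep insertion order.
    cmd: Dict[str, Any] = {"replSetStepDown": step_down_secs, "force": force}
    if catch_up_secs is not None:
        # Optional window for an electable secondary to catch up.
        cmd["secondaryCatchUpPeriodSecs"] = catch_up_secs
    return cmd


# Assumed usage with PyMongo against a primary (requires a live connection):
#   client.admin.command(build_step_down_command(force=False))
```

Comparing HostUnreachable counts between runs with force set to true and to false would be one way to test the hypothesis above.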