[SERVER-58389] Capture NetworkInterfaceExceededTimeLimit and MaxTimeMSExpired errors in resharding participants Created: 09/Jul/21  Updated: 29/Oct/23  Resolved: 04/Aug/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.0.3, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Blake Oler Assignee: Matthew Walak (Inactive)
Resolution: Fixed Votes: 0
Labels: PM-234-M3, PM-234-T-autocommits
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-79771 Make Resharding Operation Resilient t... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Sprint: Sharding 2021-07-26, Sharding 2021-08-09
Participants:
Story Points: 1

 Description   

In resharding, shards call into the config server to update the coordinator document (donor, recipient). NetworkInterfaceExceededTimeLimit and MaxTimeMSExpired errors are not considered retriable, but they are definitely reachable: these commands have a timeout of 30 seconds, and one of the listed errors is thrown if the timeout is reached. These errors escape both the generic command retry logic and the resharding-specific transient error retry logic, and ultimately cause an fassert on whichever node is running resharding.

The solution here is to figure out the best place to swallow and retry these errors.
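
As a rough illustration of what "swallow and retry" could look like, here is a minimal Python sketch. It is not the server's implementation (the server is C++). Only the two error names come from this ticket. `CommandError`, `update_coordinator_with_retry`, and the `send_update` callback are hypothetical names made up for the example.

```python
import time

# The two error names come from the ticket; everything else is hypothetical.
TIMEOUT_ERRORS = {"NetworkInterfaceExceededTimeLimit", "MaxTimeMSExpired"}


class CommandError(Exception):
    """Hypothetical stand-in for a failed command sent to the config server."""

    def __init__(self, code_name):
        super().__init__(code_name)
        self.code_name = code_name


def update_coordinator_with_retry(send_update, backoff_secs=0.0, max_attempts=None):
    """Call send_update() until it succeeds, retrying only on timeout errors.

    Returns a (result, attempts) tuple. Any error that is not one of the two
    timeout errors is re-raised immediately, as is a timeout error once
    max_attempts (if given) has been reached.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return send_update(), attempts
        except CommandError as err:
            if err.code_name not in TIMEOUT_ERRORS:
                raise  # not a timeout: let the caller handle it
            if max_attempts is not None and attempts >= max_attempts:
                raise  # retry budget exhausted
            time.sleep(backoff_secs)
```

With `max_attempts=None` the loop keeps retrying for as long as the update times out. Whether the real fix should retry indefinitely is exactly the open question in this ticket.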



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 10/Aug/21 ]

Author: Matt Walak <matt.walak@mongodb.com>

Message: SERVER-58389 Removed $maxTimeMS for updates performed by shards on the config server during a resharding operation
Branch: v5.0
https://github.com/mongodb/mongo/commit/54c7e59e3b9d498fff915026c2d7abc8db2a83f7

Comment by Githook User [ 04/Aug/21 ]

Author: Matt Walak <matt.walak@mongodb.com>

Message: SERVER-58389 Removed $maxTimeMS for updates performed by shards on the config server during a resharding operation
Branch: master
https://github.com/mongodb/mongo/commit/81d03ac5f38ca6dd7b833445a7c09bf1a8d2284a

Comment by Max Hirschhorn [ 14/Jul/21 ]

I think we should remove the $maxTimeMS for the updates that shards perform during a resharding operation on the config server.

It still doesn't make sense to me why sharding code imposes a $maxTimeMS of anything other than the remaining time of the user-supplied $maxTimeMS (which in resharding's case is infinite time). CC kaloian.manassiev
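
The rule suggested above (an internal command gets only the time remaining from the user-supplied limit, and no limit at all when the user set none) can be sketched in Python as follows. This is an illustrative assumption, not server code. The function name `remaining_time_ms` and its parameters are hypothetical.

```python
import math


def remaining_time_ms(user_max_time_ms, elapsed_ms):
    """Return the time budget (in ms) to pass on to an internal command.

    user_max_time_ms is None when the user supplied no limit, which the
    comment above says is resharding's case. The budget is then unbounded.
    """
    if user_max_time_ms is None:
        return math.inf
    # Never return a negative budget once the user's limit has passed.
    return max(0, user_max_time_ms - elapsed_ms)
```

Under this rule a resharding update would never hit a 30 second deadline of its own, which matches the committed fix of removing $maxTimeMS from these updates.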
