- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- None
- Catalog and Routing
- ALL
- CAR Team 2026-02-02
- 0
- 🟦 Shard Catalog
check_metadata_consistency_timout_tests.js might fail because checkMetadataConsistency executes successfully when the test expects it to fail with a timeout error. This happens because a failpoint is released too early.
Details:
testCMCCommandWithFailpoint is a generic function used by several test cases that:
- Enables a user-defined failpoint
- Starts a checkMetadataConsistency thread
- Waits for the failpoint to be hit
- Sleeps for a second to ensure any outstanding wait times out
- Disables the failpoint
This might be fine for some of the test scenarios. However, when checking that dbMetadataLockMaxTimeMS does not hide ExceededTimeLimit errors, the problem is that ExceededTimeLimit is a retriable error: if the failpoint is turned off too early, the mongos retries the _shardsvrCheckMetadataConsistency command, and the retry can succeed when the test expects a failure.
The purpose of this ticket is to fix the issue by waiting until all retries from the router fail, so that the failure is returned to the driver, and to ensure this doesn't happen in any other test case. We could, for example, join the checkMetadataConsistency thread first and then turn off the failpoint, as testCMCCommandWithAsyncDrop does, but we need to make sure all other test cases remain correct with this change.
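To illustrate why the ordering matters, here is a toy model in plain JavaScript. It is not the test code and does not use any MongoDB API: runWithRetries, the attempt counter, and the retry limit of 3 are all hypothetical names and values chosen for the sketch. It only assumes what the description states, namely that the router retries the command while it fails with a retriable error.

```javascript
// Toy model only: no MongoDB APIs are used, and every name is hypothetical.
// The "router" retries a command as long as it fails with a retriable error.
function runWithRetries(isFailpointOn, maxAttempts) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (!isFailpointOn(attempt)) {
            // Failpoint already released: this attempt succeeds.
            return {ok: 1, attempts: attempt};
        }
        // Failpoint still active: the attempt times out and the router retries.
    }
    // Every attempt timed out, so the error is returned to the caller.
    return {ok: 0, codeName: "ExceededTimeLimit", attempts: maxAttempts};
}

// Current order: the failpoint is disabled after the first attempt times out,
// so the second attempt (a retry) succeeds and the timeout is never observed.
const earlyRelease = runWithRetries((attempt) => attempt <= 1, 3);

// Proposed order: the failpoint stays enabled until the command thread is
// joined, so all retries fail and the timeout reaches the caller.
const joinFirst = runWithRetries(() => true, 3);

console.log(earlyRelease);
console.log(joinFirst);
```

In this model the early release produces a successful result on the retry, while keeping the failpoint on for the whole command surfaces ExceededTimeLimit, which is the outcome the test wants to assert.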
- is caused by SERVER-99440 Add timeout parameter for check metadata consistency database operation (Closed)