- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: 9.0.0-rc0
- Component/s: Catalog, Replication, Sharding
- Catalog and Routing
- CAR Team 2026-05-11
- 🟦 Shard Catalog

Fix two issues that can cause a tassert or a crash if a stepdown happens during an authoritative collection metadata refresh:
- If a stepdown interrupts the refresh, the refresh may swallow any error code and keep retrying. Eventually it hits the retry limit and tasserts.
  - We should stop retrying if the opCtx has been interrupted.
- On stepdown, the replication coordinator interrupts all optime waiters and then invariants that none remain. However, it misses the waiters registered by ReplicationCoordinator::registerWaiterForMajorityReadOpTime (which are used when recovering the authoritative metadata from disk), so the invariant can fail.
  - We should also interrupt those optime waiters.
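
The two proposed fixes can be illustrated with a minimal, standalone C++ sketch. It is not the server code: `FakeOpCtx`, `Interrupted`, `refreshWithRetries`, and `FakeReplCoordinator` are hypothetical stand-ins invented for illustration, the `std::logic_error` only models the tassert, and the `assert` only models the invariant. Only the two behaviours are taken from the description above: stop retrying once the operation is interrupted, and interrupt every waiter list before checking that none remain.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an operation context (not the server's type).
struct FakeOpCtx {
    bool interrupted = false;
};

// Hypothetical error raised when an operation is interrupted, e.g. by stepdown.
struct Interrupted : std::runtime_error {
    Interrupted() : std::runtime_error("operation interrupted") {}
};

// Fix 1 (sketch): retry transient failures, but rethrow immediately once the
// operation has been interrupted instead of swallowing the error and retrying.
// Returns the number of attempts it took to succeed.
int refreshWithRetries(FakeOpCtx& opCtx, int maxAttempts,
                       const std::function<void()>& attempt) {
    for (int i = 1; i <= maxAttempts; ++i) {
        try {
            attempt();
            return i;
        } catch (const std::exception&) {
            if (opCtx.interrupted) {
                throw;  // propagate the original error; do not keep retrying
            }
            // Otherwise treat the failure as transient and try again.
        }
    }
    // Models the tassert that fires when the retry limit is reached.
    throw std::logic_error("retry limit reached");
}

// Fix 2 (sketch): on stepdown, interrupt every list of waiters, including the
// majority-read waiters, before asserting that no waiter is left.
struct FakeReplCoordinator {
    std::vector<std::function<void()>> opTimeWaiters;
    std::vector<std::function<void()>> majorityReadWaiters;

    static void interruptAll(std::vector<std::function<void()>>& waiters) {
        for (auto& onInterrupt : waiters) {
            onInterrupt();  // notify the waiter that it was interrupted
        }
        waiters.clear();
    }

    void stepDown() {
        interruptAll(opTimeWaiters);
        interruptAll(majorityReadWaiters);  // the list the description says is missed
        // Models the invariant that no waiter remains after stepdown.
        assert(opTimeWaiters.empty() && majorityReadWaiters.empty());
    }
};
```

In this sketch, an attempt that fails while `opCtx.interrupted` is set surfaces the original error after a single try, so the retry limit is never reached because of a stepdown. Removing the second `interruptAll` call would leave `majorityReadWaiters` populated and trip the final `assert`, which mirrors the crash described above.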