ShardSvrDropIndexes() should not skip validated shards on retries during concurrent chunk migrations


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing
    • ALL

      SERVER-104721 caught a bug in shardsvr_drop_indexes_command.cpp. The order of events that led to hitting this assertion is as follows:

      1. On the first dropIndexes() attempt, we target Shard0 and Shard1.
      2. We send requests to both shards; Shard0 returns OK() and is therefore excluded from targeting on the next retry.
      3. The shardVersionRetry helper refreshes the CatalogCache, so that we get up-to-date routing tables for Shard1 on the next retry.
      4. However, as a chunk migration was happening concurrently in the background, it's possible that the requested range was moved out of Shard1.
      5. So, when we call scatterGatherVersionedTargetByRoutingTableNoThrowOnStaleShardVersionErrors again, Shard0 is excluded, and the updated cached routing table tells us that Shard1 doesn't own data for our requested range anymore.
      6. Thus, when we build the requests to send to the shards, requests is empty (getShardIdsForQuery no longer includes Shard1), so we don't receive any responses and never go through nss validation here (see the sketch after this list).
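
      The failure mode in steps 1-6 can be illustrated with a small, self-contained C++ sketch. This is a simplified model under assumed names (ShardId, RoutingTable, buildRequests), not the actual code in shardsvr_drop_indexes_command.cpp: shards that already succeeded are skipped on retry, and a concurrent migration can shrink the targeted set to nothing, so the validation branch is never reached.

      // Simplified model of the retry flow described above; all names here are
      // illustrative and do not correspond to the real server implementation.
      #include <iostream>
      #include <set>
      #include <string>

      using ShardId = std::string;
      // The set of shards that currently own chunks for the namespace.
      using RoutingTable = std::set<ShardId>;

      // Build one request per shard that still owns data according to the freshly
      // refreshed routing table and has not already returned a success.
      std::set<ShardId> buildRequests(const RoutingTable& routing,
                                      const std::set<ShardId>& shardsWithSuccessResponses) {
          std::set<ShardId> requests;
          for (const auto& shard : routing) {
              if (shardsWithSuccessResponses.count(shard) == 0) {
                  requests.insert(shard);
              }
          }
          return requests;
      }

      int main() {
          std::set<ShardId> shardsWithSuccessResponses;

          // Attempt 1: the routing table says Shard0 and Shard1 own chunks for the range.
          RoutingTable routing{"Shard0", "Shard1"};
          // Shard0 returns OK and is recorded; Shard1 fails with a stale shard version,
          // so the operation is retried.
          shardsWithSuccessResponses.insert("Shard0");

          // A concurrent chunk migration moves the requested range off Shard1, so the
          // refreshed routing table no longer includes it.
          routing = {"Shard0"};

          // Attempt 2: Shard0 is skipped (already succeeded) and Shard1 is no longer
          // targeted, so no requests are built and no responses come back.
          auto requests = buildRequests(routing, shardsWithSuccessResponses);
          if (requests.empty()) {
              // With zero responses, the namespace validation step is skipped and the
              // command can report success without dropIndexes() being confirmed on
              // every shard that held the range.
              std::cout << "no requests built; validation skipped\n";
          }
          return 0;
      }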

      This is a bug: we now end up in a state where Shard1 has no indexes but Shard0 does (migrated from Shard1). Previously, the operation could have reported success even though the dropIndexes() command wasn't carried out correctly across all shards.

      If there are no shardResponses but we have stored shards in shardsWithSuccessResponses, the vector of successful shards to skip on the next retry should also be cleared, and the CatalogCache should be refreshed on the next access, as sketched below.
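
      A minimal sketch of that handling, continuing the hypothetical model above (RetryState and handleEmptyResponseSet are illustrative names, not symbols from the server code):

      #include <cstddef>
      #include <set>
      #include <string>

      using ShardId = std::string;

      // Retry bookkeeping carried across attempts in the simplified model.
      struct RetryState {
          std::set<ShardId> shardsWithSuccessResponses;
          bool routingInfoNeedsRefresh = false;
      };

      // If a retry produced no shard responses while earlier attempts already
      // recorded successes, drop that bookkeeping and mark the cached routing
      // information as needing a refresh, so the next attempt re-targets every
      // owning shard and runs namespace validation again.
      void handleEmptyResponseSet(RetryState& state, std::size_t numShardResponses) {
          if (numShardResponses == 0 && !state.shardsWithSuccessResponses.empty()) {
              state.shardsWithSuccessResponses.clear();
              state.routingInfoNeedsRefresh = true;
          }
      }

      int main() {
          RetryState state;
          state.shardsWithSuccessResponses.insert("Shard0");
          handleEmptyResponseSet(state, /*numShardResponses=*/0);
          // state.shardsWithSuccessResponses is now empty, and the next attempt will
          // refresh the routing information before targeting.
          return 0;
      }

      With this handling, the scenario above falls through to a full retry instead of returning success with zero responses.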

            Assignee:
            Unassigned
            Reporter:
            Lynne Wang
            Votes:
            0
            Watchers:
            3
