[SERVER-44249] Shard can continue to accept reads for a dropped collection if shard failover happens during setShardVersion sent at end of dropCollection
Created: 25/Oct/19  Updated: 27/Oct/23  Resolved: 09/Jul/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Esha Maharishi (Inactive)
Assignee: Randolph Tan
Resolution: Gone away
Votes: 0
Labels: sharding-DDL-bugs, sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-34061 Stop preemptively loading sharded col... Closed
Operating System: ALL
Sprint: Sharding 2020-07-13
Participants:
Linked BF Score: 19

 Description   

The config server can target an old shard primary (that has yet to find out that there is a new shard primary) for setShardVersion.

The old shard primary can step down while executing the setShardVersion and continue executing setShardVersion as a secondary as long as the stepdown occurs before here, because stepdown only kills operations that have taken a MODE_IX, MODE_S, or MODE_X lock. The old shard primary will then send _flushRoutingTableCacheUpdates to the new shard primary. If the new shard primary does not have an entry for the database in its CatalogCache, it will not schedule a collection refresh against the ShardServerCatalogCacheLoader, so the config.cache* entries for the dropped collection will not get deleted.
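
The skipped refresh on the new primary can be illustrated with a small toy model. This is only a sketch of the behavior described above, not MongoDB's code: ToyShardPrimary, catalog_cache, persisted_cache, and flush_routing_table_cache_updates are hypothetical names invented for the illustration, with persisted_cache standing in for the config.cache* entries.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShardPrimary:
    """Toy stand-in for the new shard primary; not MongoDB's real classes."""

    # In-memory routing cache: database name -> set of known collection names.
    catalog_cache: dict = field(default_factory=dict)
    # Persisted routing metadata, standing in for the config.cache* entries.
    persisted_cache: set = field(default_factory=set)

    def flush_routing_table_cache_updates(self, db: str, coll: str) -> bool:
        """Return True if a collection refresh was scheduled."""
        if db not in self.catalog_cache:
            # Behavior described in this ticket: with no database entry,
            # no refresh is scheduled, so nothing gets cleaned up.
            return False
        # A refresh notices the drop and removes the persisted entry.
        self.persisted_cache.discard(f"{db}.{coll}")
        self.catalog_cache[db].discard(coll)
        return True


# New primary that never loaded the database into its in-memory cache.
cold = ToyShardPrimary(persisted_cache={"test.foo"})
cold_refreshed = cold.flush_routing_table_cache_updates("test", "foo")

# Primary that already has the database entry cached.
warm = ToyShardPrimary(
    catalog_cache={"test": {"foo"}}, persisted_cache={"test.foo"}
)
warm_refreshed = warm.flush_routing_table_cache_updates("test", "foo")
```

In this model the cold node returns without refreshing and keeps its stale "test.foo" entry, while the warm node removes it.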

After the _flushRoutingTableCacheUpdates returns, the old shard primary (now secondary) will think its filtering table is up to date and will continue to accept reads for the dropped collection (even if the collection has been recreated elsewhere), which violates causal consistency.



 Comments   
Comment by Randolph Tan [ 09/Jul/20 ]

After SERVER-34061, we no longer preload all sharded collections when loading the db object. Instead of invalidating the collection entry (a step that was previously skipped if the entry didn't exist), we also changed CatalogCache::_getCollectionRoutingInfoWithForcedRefresh to always create new collection entries with refreshNeeded = true. Therefore, when _flushRoutingTableCacheUpdates is run, it should properly clean up the config.cache.* entries when it tries to refresh the collection.
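
A rough sketch of why creating entries with refreshNeeded = true closes the gap, under the assumption that a refresh of a dropped collection removes its persisted entry. ToyCatalogCache, ToyCollectionEntry, and the exists_on_config parameter are hypothetical; only the refreshNeeded idea comes from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCollectionEntry:
    # New entries always start out needing a refresh.
    refresh_needed: bool = True


@dataclass
class ToyCatalogCache:
    entries: dict = field(default_factory=dict)
    # Stand-in for the persisted config.cache.* entries.
    persisted_cache: set = field(default_factory=set)

    def get_routing_info_with_forced_refresh(self, ns: str, exists_on_config: bool):
        # Create the entry on demand instead of skipping when it is absent.
        entry = self.entries.setdefault(ns, ToyCollectionEntry())
        entry.refresh_needed = True
        return self._refresh(ns, exists_on_config)

    def _refresh(self, ns: str, exists_on_config: bool):
        entry = self.entries[ns]
        if entry.refresh_needed:
            if not exists_on_config:
                # The collection was dropped: clean up persisted metadata.
                self.persisted_cache.discard(ns)
                del self.entries[ns]
                return None
            entry.refresh_needed = False
        return entry


# A node with no in-memory entry but a stale persisted one.
cache = ToyCatalogCache(persisted_cache={"test.foo"})
routing_info = cache.get_routing_info_with_forced_refresh(
    "test.foo", exists_on_config=False
)
```

Because the entry is created rather than skipped, the refresh runs even on a node that had never cached the collection, and the stale persisted entry is removed.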

Comment by Esha Maharishi (Inactive) [ 11/May/20 ]

Bringing this into sharding-wfbf-day since the failure has been happening intermittently in Evergreen.

We could fix this by making setShardVersion take a global IX lock at the beginning of the command and check that the node is primary while holding it. That way, if the node steps down during the setShardVersion, the node will return InterruptedDueToReplStateChange and the config server should retry the setShardVersion against the new primary.
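
The intended control flow of that proposal might look roughly like the following. This is a simplified sketch, not the server implementation: ToyNode, send_with_retry, and the node names are hypothetical, and the primary check stands in for taking the global IX lock and checking primary status under it.

```python
class InterruptedDueToReplStateChange(Exception):
    """Toy stand-in for the server error of the same name."""


class ToyNode:
    def __init__(self, name, is_primary):
        self.name = name
        self.is_primary = is_primary

    def set_shard_version(self):
        # Stands in for taking the global IX lock and checking primary
        # status under it: a non-primary fails instead of continuing.
        if not self.is_primary:
            raise InterruptedDueToReplStateChange(self.name)
        return f"ok from {self.name}"


def send_with_retry(candidates):
    """Try each candidate in order, moving on after a repl-state interrupt."""
    for node in candidates:
        try:
            return node.set_shard_version()
        except InterruptedDueToReplStateChange:
            continue
    raise RuntimeError("no primary found")


old_primary = ToyNode("shard0-a", is_primary=False)  # already stepped down
new_primary = ToyNode("shard0-b", is_primary=True)
result = send_with_retry([old_primary, new_primary])
```

The point of the design is that the stepped-down node fails fast, so the command is never completed by a secondary with a possibly stale filtering table.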
