[SERVER-44249] Shard can continue to accept reads for a dropped collection if shard failover happens during setShardVersion sent at end of dropCollection
Created: 25/Oct/19 | Updated: 27/Oct/23 | Resolved: 09/Jul/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Esha Maharishi (Inactive) | Assignee: | Randolph Tan |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | sharding-DDL-bugs, sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Sprint: | Sharding 2020-07-13 |
| Participants: | |
| Linked BF Score: | 19 |
| Description |
|
The config server can target an old shard primary (one that has not yet learned that there is a new shard primary) for setShardVersion. The old shard primary can step down while executing the setShardVersion and continue executing it as a secondary, as long as the stepdown occurs before here, because stepdown only kills operations that have taken a MODE_IX, MODE_S, or MODE_X lock. The old shard primary will then send _flushRoutingTableCacheUpdates to the new shard primary. If the new shard primary does not have an entry for the database in its CatalogCache, it will not schedule a collection refresh against the ShardServerCatalogCacheLoader, so the config.cache* entries for the dropped collection will not get deleted. After the _flushRoutingTableCacheUpdates returns, the old shard primary (now a secondary) will consider its filtering table up to date and will continue to accept reads for the dropped collection (even if the collection has been recreated elsewhere), which violates causal consistency. |
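The sequence above can be illustrated with a toy model. This is only a sketch of the behaviour as described in this ticket, not MongoDB code: the function name `toy_flush_routing_table_cache_updates`, its parameters, and the use of plain Python sets to stand in for the CatalogCache and the config.cache* entries are all assumptions made for illustration.

```python
def toy_flush_routing_table_cache_updates(primary_cached_dbs, persisted_cache, ns, collection_dropped):
    """Toy stand-in for _flushRoutingTableCacheUpdates on the new primary.

    primary_cached_dbs: databases the new primary has a (toy) CatalogCache entry for.
    persisted_cache:    namespaces with (toy) config.cache.* entries, assumed to be
                        what the secondary uses to decide whether to serve reads.
    Returns True if a collection refresh was performed, False if it was skipped.
    """
    db = ns.split(".", 1)[0]
    if db not in primary_cached_dbs:
        # No database entry: no collection refresh is scheduled,
        # so stale entries for the dropped collection are left in place.
        return False
    if collection_dropped:
        persisted_cache.discard(ns)
    return True


# Case from the ticket: the new primary has no database entry.
stale_cache = {"test.coll"}
refreshed_without_db = toy_flush_routing_table_cache_updates(set(), stale_cache, "test.coll", True)

# Contrast: the new primary does have the database entry.
clean_cache = {"test.coll"}
refreshed_with_db = toy_flush_routing_table_cache_updates({"test"}, clean_cache, "test.coll", True)
```

In the first case the flush returns without refreshing and `stale_cache` still contains `test.coll`, which corresponds to the old primary (now a secondary) continuing to accept reads for the dropped collection.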
| Comments |
| Comment by Randolph Tan [ 09/Jul/20 ] |
|
After |
| Comment by Esha Maharishi (Inactive) [ 11/May/20 ] |
|
Bringing this into sharding-wfbf-day since the failure has been happening intermittently in Evergreen. We could fix this by making setShardVersion take a global IX lock at the beginning of the command and check that the node is primary while holding it. That way, if the node steps down during the setShardVersion, the node will return InterruptedDueToReplStateChange and the config server should retry the setShardVersion against the new primary. |
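A rough sketch of the proposed behaviour, under simplifying assumptions: the classes and the retry helper below are hypothetical and do not mirror the server's real interfaces, and the global IX lock is not modelled. The node simply checks its primary status up front, which is the effect the lock-then-check would have for a node that has already stepped down.

```python
class InterruptedDueToReplStateChange(Exception):
    """Toy counterpart of the error code named in the comment above."""


class ToyNode:
    """Hypothetical shard node; only tracks whether it is currently primary."""

    def __init__(self, name, is_primary):
        self.name = name
        self.is_primary = is_primary

    def set_shard_version(self):
        # Proposed fix (simplified): check primary status at the start of the
        # command, conceptually while holding the global IX lock.
        if not self.is_primary:
            raise InterruptedDueToReplStateChange(self.name)
        return f"ok from {self.name}"


def toy_send_with_retry(nodes, max_attempts=3):
    """Toy config-server loop: on interruption, retry against the next node."""
    for attempt in range(max_attempts):
        target = nodes[attempt % len(nodes)]
        try:
            return target.set_shard_version()
        except InterruptedDueToReplStateChange:
            continue
    raise RuntimeError("no primary found")


old_primary = ToyNode("old", is_primary=False)  # has already stepped down
new_primary = ToyNode("new", is_primary=True)
result = toy_send_with_retry([old_primary, new_primary])
```

Here the first attempt targets the stale primary and is interrupted, and the retry reaches the new primary, so the stepped-down node never runs the rest of the command as a secondary.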