- Type: Bug
- Resolution: Fixed
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- ALL
- v6.0, v5.0
- Sharding 2022-08-08, Sharding 2022-08-22
- 144
- 3
While a resharding operation is ongoing, every write to the collection being resharded is amplified to a "db.system.resharding.<uuid>" sharded collection (the temporary resharding collection). This is achieved in part by having ShardingWriteRouter::getReshardingDestinedRecipient() populate a "destinedRecipient" field in the oplog entries for inserts, updates, and deletes on the collection being resharded, so that the appropriate recipient shard can later fetch those entries. To make this routing decision, ShardingWriteRouter calls CatalogCache::getCollectionRoutingInfo() rather than CatalogCache::getCollectionRoutingInfoWithRefresh(). This is safe if the primary of the donor shard hasn't changed, because that primary will have already refreshed the routing information for the temporary resharding collection earlier. However, if a new primary of the donor shard has been elected, then its routing information for the temporary resharding collection may be arbitrarily stale. Stale routing information is problematic for a couple of reasons:
- If the routing information for the temporary resharding collection says the collection is unsharded, then ShardingWriteRouter calling ChunkManager::findIntersectingChunkWithSimpleCollation() will result in a segmentation fault.
- If the routing information for the temporary resharding collection represents the chunk distribution from a prior resharding attempt, then the recipient shards may miss applying oplog entries and not end up consistent with the collection being resharded.
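The two failure modes above can be illustrated with a toy model in Python. This is only a sketch and not the server's code: RoutingInfo, ToyCatalogCache, and destined_recipient are hypothetical stand-ins for the cached routing information, CatalogCache, and the destined-recipient lookup, and the chunk layout is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RoutingInfo:
    """Hypothetical stand-in for cached routing info of the temporary collection."""
    sharded: bool
    # Sorted list of (exclusive upper bound on the shard key, owning shard).
    chunks: list


class ToyCatalogCache:
    """Hypothetical cache: `cached` may be stale, `authoritative` is the truth."""

    def __init__(self, cached, authoritative):
        self.cached = cached
        self.authoritative = authoritative

    def get(self):
        # Models a lookup that returns whatever is cached, without refreshing.
        return self.cached

    def get_with_refresh(self):
        # Models a lookup that reloads from the authoritative source first.
        self.cached = self.authoritative
        return self.cached


def destined_recipient(info, key):
    """Return the shard whose chunk contains `key`."""
    if not info.sharded:
        # The toy raises here; the ticket reports a segmentation fault instead.
        raise RuntimeError("routing info says the collection is unsharded")
    for upper_bound, shard in info.chunks:
        if key < upper_bound:
            return shard
    raise RuntimeError("no chunk covers the key")


# Chunk distribution of the current resharding attempt (invented values).
current = RoutingInfo(True, [(50, "shardA"), (float("inf"), "shardB")])
# Chunk distribution left over from a prior attempt (invented values).
previous_attempt = RoutingInfo(True, [(10, "shardA"), (float("inf"), "shardB")])

# A newly elected donor primary that still holds the prior attempt's routing info.
cache = ToyCatalogCache(cached=previous_attempt, authoritative=current)
stale_choice = destined_recipient(cache.get(), 25)               # "shardB"
fresh_choice = destined_recipient(cache.get_with_refresh(), 25)  # "shardA"
```

In this model the stale lookup tags the write for shardB, while the shard that actually owns the key in the current attempt is shardA, so shardA would never fetch that oplog entry.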
Running the flushRouterConfig command on all mongod --shardsvr processes before re-attempting a failed resharding operation will prevent the routing information for the temporary resharding collection from being stale.
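A minimal sketch of that workaround in Python, assuming the pymongo driver is installed and that each shard server can be reached directly. The helper name flush_router_config, its connect hook, and the host names in the usage note are hypothetical; authentication and TLS options are omitted.

```python
def flush_router_config(hosts, connect=None):
    """Run flushRouterConfig against each host and return the replies by host.

    `connect` is a hypothetical hook that maps a host string to a client
    object; by default it opens a direct pymongo connection.
    """
    if connect is None:
        from pymongo import MongoClient  # imported lazily; requires pymongo

        def connect(host):
            # directConnection=True targets this member only, rather than
            # discovering the rest of its replica set.
            return MongoClient(host, directConnection=True)

    replies = {}
    for host in hosts:
        client = connect(host)
        try:
            # Sends {flushRouterConfig: 1} to the admin database.
            replies[host] = client.admin.command("flushRouterConfig")
        finally:
            client.close()
    return replies
```

For example, flush_router_config(["shard0-a.example.net:27018", "shard1-a.example.net:27018"]) would issue the command once per listed mongod; the list should cover every mongod --shardsvr process before the resharding operation is re-attempted.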