Because of a mirrored read or an earlier failed resharding operation, a secondary can already be aware of the temporary resharding namespace and believe the collection is unsharded. A primary is guaranteed to have refreshed its CatalogCache after the config.chunks entries have been written on the config server primary, but there is no equivalent guarantee for secondaries. As a result, the call to CatalogCache::getShardedCollectionRoutingInfo() in the $_internalReshardingOwnershipMatch stage can throw a NamespaceNotSharded exception on a secondary.
We can instead use CatalogCache::getShardedCollectionRoutingInfoWithRefresh(), which ensures the secondary has refreshed after the config.chunks entries were written on the config server primary and therefore knows the temporary resharding namespace is sharded.
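The difference between the two lookups can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyConfigServer`, `ToyCatalogCache`, `plainLookupThrows`, and `demonstrateStaleSecondary` are hypothetical names invented for illustration, the namespace string is made up, and none of this mirrors the real CatalogCache interface (which deals with versions, asynchronous refreshes, and routing tables). It only shows why a cached "unsharded" answer survives until a refresh is forced.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the authoritative sharding metadata
// (config.chunks on the config server primary).
struct ToyConfigServer {
    std::map<std::string, bool> sharded;  // namespace -> has chunk entries

    bool isSharded(const std::string& nss) const {
        auto it = sharded.find(nss);
        return it != sharded.end() && it->second;
    }
};

// Toy exception named after the error seen in the logs.
struct NamespaceNotSharded : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical per-node cache. Not the real CatalogCache interface.
class ToyCatalogCache {
public:
    explicit ToyCatalogCache(const ToyConfigServer& config) : _config(config) {}

    // Trusts whatever is cached; only consults the config server on a miss.
    void getShardedRoutingInfo(const std::string& nss) {
        auto it = _cache.find(nss);
        if (it == _cache.end()) {
            it = _cache.emplace(nss, _config.isSharded(nss)).first;
        }
        if (!it->second) {
            throw NamespaceNotSharded("Expected collection " + nss + " to be sharded");
        }
    }

    // Always re-reads the config server before answering.
    void getShardedRoutingInfoWithRefresh(const std::string& nss) {
        _cache[nss] = _config.isSharded(nss);
        getShardedRoutingInfo(nss);
    }

private:
    const ToyConfigServer& _config;
    std::map<std::string, bool> _cache;
};

// Returns true if the plain (non-refreshing) lookup throws NamespaceNotSharded.
bool plainLookupThrows(ToyCatalogCache& cache, const std::string& nss) {
    try {
        cache.getShardedRoutingInfo(nss);
        return false;
    } catch (const NamespaceNotSharded&) {
        return true;
    }
}

void demonstrateStaleSecondary() {
    const std::string nss = "test.system.resharding.example";  // made-up namespace
    ToyConfigServer config;
    ToyCatalogCache secondary(config);

    // An early lookup (e.g. triggered by a mirrored read) caches "unsharded".
    assert(plainLookupThrows(secondary, nss));

    // The chunk entries are written afterwards.
    config.sharded[nss] = true;

    // The cached answer is now stale, so the plain lookup still throws.
    assert(plainLookupThrows(secondary, nss));

    // Refreshing first picks up the new metadata.
    secondary.getShardedRoutingInfoWithRefresh(nss);
    assert(!plainLookupThrows(secondary, nss));
}
```

Calling `demonstrateStaleSecondary()` from a `main` function walks through the sequence described above: the stale secondary keeps throwing until a lookup forces a refresh.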
[js_test:resharding_replicate_updates_as_insert_delete] d20770| 2021-12-06T22:22:48.821+00:00 I SH_REFR 4619902 [CatalogCache-1] "Collection has found to be unsharded after refresh","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa","durationMillis":13}
[js_test:resharding_replicate_updates_as_insert_delete] d20770| 2021-12-06T22:22:48.821+00:00 I SHARDING 21917 [RecoverRefreshThread] "Marking collection as unsharded","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa"}
[js_test:resharding_replicate_updates_as_insert_delete] d20771| 2021-12-06T22:22:48.823+00:00 I SH_REFR 4619902 [CatalogCache-0] "Collection has found to be unsharded after refresh","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa","durationMillis":17}
[js_test:resharding_replicate_updates_as_insert_delete] d20771| 2021-12-06T22:22:48.823+00:00 I SHARDING 21917 [RecoverRefreshThread] "Marking collection as unsharded","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa"}
...
[js_test:resharding_replicate_updates_as_insert_delete] d20772| 2021-12-06T22:22:49.630+00:00 E RESHARD 5352400 [ReshardingRecipientService-0] "Operation-fatal error for resharding while cloning sharded collection","attr":{"sourceNamespace":"test.foo","outputNamespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa","readTimestamp":{"$timestamp":{"t":1638829369,"i":4}},"error":"NamespaceNotSharded: Error on remote shard EC2AMAZ-6BUUU1A:20771 :: caused by :: Executor error during getMore :: caused by :: Expected collection test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa to be sharded"}
[js_test:setfcv_reshard_collection] d20277| 2021-12-02T12:55:55.323+00:00 I COMMAND 20332 [ReplWriterWorker-1] "CMD: drop","attr":{"namespace":"config.cache.chunks.reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490"}
[js_test:setfcv_reshard_collection] d20277| 2021-12-02T12:55:55.325+00:00 I SH_REFR 4619902 [CatalogCache-0] "Collection has found to be unsharded after refresh","attr":{"namespace":"reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490","durationMillis":99}
...
[js_test:setfcv_reshard_collection] d20276| 2021-12-02T12:55:55.834+00:00 E RESHARD 5352400 [ReshardingRecipientService-2] "Operation-fatal error for resharding while cloning sharded collection","attr":{"sourceNamespace":"reshardingDb.testColl","outputNamespace":"reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490","readTimestamp":{"$timestamp":{"t":1638449755,"i":81}},"error":"NamespaceNotSharded: Error on remote shard ip-10-122-57-106.ec2.internal:20277 :: caused by :: Executor error during getMore :: caused by :: Expected collection reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490 to be sharded"}
is caused by: SERVER-60860 ReshardingCollectionCloner uses primary read preference when nearest was intended (Closed)