  1. Core Server
  2. SERVER-61945

Resharding collection cloning may fail with NamespaceNotSharded when "nearest" read preference chooses secondary

    • Fully Compatible
    • ALL
    • v5.2, v5.1, v5.0
    • Sharding 2021-12-13
    • 40
    • 2

      A secondary can become aware of the temporary resharding namespace, through a mirrored read or an earlier failed resharding operation, while still believing that the collection is unsharded. A primary is guaranteed to have refreshed its CatalogCache after the config.chunks entries have been written on the config server primary, but there is no equivalent guarantee for secondaries. As a result, when the "nearest" read preference selects a secondary, the call to CatalogCache::getShardedCollectionRoutingInfo() in the $_internalReshardingOwnershipMatch stage can throw a NamespaceNotSharded exception.

      We can instead use CatalogCache::getShardedCollectionRoutingInfoWithRefresh(). This ensures that the secondary refreshes after the config.chunks entries have been written on the config server primary, so it also knows that the temporary resharding namespace is sharded.
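
      The difference between the two calls can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the server's implementation: ToyConfigServer, ToyCatalogCache, ToyNamespaceNotSharded, throwsNotSharded and demoStaleSecondary are hypothetical names invented for illustration, and the real CatalogCache refresh path on secondaries is considerably more involved. The model assumes only that the plain lookup trusts an existing cache entry, while the "with refresh" lookup always re-reads the authoritative state first.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the authoritative metadata on the config server.
struct ToyConfigServer {
    std::map<std::string, bool> sharded;  // namespace -> is the collection sharded?
};

// Hypothetical stand-in for the NamespaceNotSharded error.
struct ToyNamespaceNotSharded : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy model of a per-node routing cache (an assumption, not the real CatalogCache).
class ToyCatalogCache {
public:
    explicit ToyCatalogCache(const ToyConfigServer& config) : _config(config) {}

    // Copies the authoritative sharded/unsharded state for nss into the cache.
    void refresh(const std::string& nss) {
        auto it = _config.sharded.find(nss);
        _cached[nss] = (it != _config.sharded.end()) && it->second;
    }

    // Analogue of the plain lookup: trusts an existing cache entry, even a stale one.
    void getShardedRoutingInfo(const std::string& nss) {
        if (_cached.find(nss) == _cached.end()) {
            refresh(nss);
        }
        requireSharded(nss);
    }

    // Analogue of the "with refresh" lookup: always refreshes before checking.
    void getShardedRoutingInfoWithRefresh(const std::string& nss) {
        refresh(nss);
        requireSharded(nss);
    }

private:
    void requireSharded(const std::string& nss) const {
        if (!_cached.at(nss)) {
            throw ToyNamespaceNotSharded("Expected collection " + nss + " to be sharded");
        }
    }

    const ToyConfigServer& _config;
    std::map<std::string, bool> _cached;
};

// Returns true if the chosen lookup throws ToyNamespaceNotSharded.
inline bool throwsNotSharded(ToyCatalogCache& cache, const std::string& nss, bool withRefresh) {
    try {
        if (withRefresh) {
            cache.getShardedRoutingInfoWithRefresh(nss);
        } else {
            cache.getShardedRoutingInfo(nss);
        }
    } catch (const ToyNamespaceNotSharded&) {
        return true;
    }
    return false;
}

// Walks through the scenario described above on a toy secondary.
inline void demoStaleSecondary() {
    ToyConfigServer config;
    ToyCatalogCache secondary(config);
    const std::string nss = "test.system.resharding.tmp";  // hypothetical namespace

    // The secondary learns about the namespace early (e.g. via a mirrored read)
    // and caches it as unsharded.
    secondary.refresh(nss);

    // The chunk entries are then written on the config server.
    config.sharded[nss] = true;

    // The plain lookup still sees the stale entry and fails.
    assert(throwsNotSharded(secondary, nss, /*withRefresh=*/false));

    // The refreshing lookup picks up the new state and succeeds.
    assert(!throwsNotSharded(secondary, nss, /*withRefresh=*/true));
}
```

      Calling demoStaleSecondary() from a test or from main() exercises both paths. In this model the refreshing lookup costs an extra read of the authoritative state on each call, which is the trade-off for not depending on when the secondary last refreshed.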

      [js_test:resharding_replicate_updates_as_insert_delete] d20770| 2021-12-06T22:22:48.821+00:00 I  SH_REFR  4619902 [CatalogCache-1] "Collection has found to be unsharded after refresh","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa","durationMillis":13}
      [js_test:resharding_replicate_updates_as_insert_delete] d20770| 2021-12-06T22:22:48.821+00:00 I  SHARDING 21917   [RecoverRefreshThread] "Marking collection as unsharded","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa"}
      [js_test:resharding_replicate_updates_as_insert_delete] d20771| 2021-12-06T22:22:48.823+00:00 I  SH_REFR  4619902 [CatalogCache-0] "Collection has found to be unsharded after refresh","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa","durationMillis":17}
      [js_test:resharding_replicate_updates_as_insert_delete] d20771| 2021-12-06T22:22:48.823+00:00 I  SHARDING 21917   [RecoverRefreshThread] "Marking collection as unsharded","attr":{"namespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa"}
      ...
      [js_test:resharding_replicate_updates_as_insert_delete] d20772| 2021-12-06T22:22:49.630+00:00 E  RESHARD  5352400 [ReshardingRecipientService-0] "Operation-fatal error for resharding while cloning sharded collection","attr":{"sourceNamespace":"test.foo","outputNamespace":"test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa","readTimestamp":{"$timestamp":{"t":1638829369,"i":4}},"error":"NamespaceNotSharded: Error on remote shard EC2AMAZ-6BUUU1A:20771 :: caused by :: Executor error during getMore :: caused by :: Expected collection test.system.resharding.bce940ec-7251-4fc2-9dbd-b45d5614a2aa to be sharded"}
      
      [js_test:setfcv_reshard_collection] d20277| 2021-12-02T12:55:55.323+00:00 I  COMMAND  20332   [ReplWriterWorker-1] "CMD: drop","attr":{"namespace":"config.cache.chunks.reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490"}
      [js_test:setfcv_reshard_collection] d20277| 2021-12-02T12:55:55.325+00:00 I  SH_REFR  4619902 [CatalogCache-0] "Collection has found to be unsharded after refresh","attr":{"namespace":"reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490","durationMillis":99}
      ...
      [js_test:setfcv_reshard_collection] d20276| 2021-12-02T12:55:55.834+00:00 E  RESHARD  5352400 [ReshardingRecipientService-2] "Operation-fatal error for resharding while cloning sharded collection","attr":{"sourceNamespace":"reshardingDb.testColl","outputNamespace":"reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490","readTimestamp":{"$timestamp":{"t":1638449755,"i":81}},"error":"NamespaceNotSharded: Error on remote shard ip-10-122-57-106.ec2.internal:20277 :: caused by :: Executor error during getMore :: caused by :: Expected collection reshardingDb.system.resharding.49b12d21-45eb-4d34-9e8b-22465a75d490 to be sharded"}
      

            Assignee:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Votes:
            0
            Watchers:
            2
