SERVER-59023 (Core Server)

Resharding can fail with NamespaceNotSharded following a primary failover on a recipient shard

    • Fully Compatible
    • ALL
    • v5.0
    • Sharding 2021-08-09, Sharding 2021-08-23
    • 2

      A secondary that steps up to primary won't necessarily have loaded the collection metadata for the temporary resharding collection. This can cause assertCanExtractShardKeyFromDocs() to throw a NamespaceNotSharded exception when the ReshardingOplogApplier attempts to write to the temporary resharding collection.
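The failure mode can be modeled in a few lines. The sketch below is a toy, not server code: `ToyMetadata`, `ToyShardingRuntime`, `NamespaceNotShardedError`, and `assertTempCollectionIsSharded` are hypothetical stand-ins, and the namespace used in the usage note is made up. It assumes only what the description states: a node that has just stepped up may have no cached metadata for the temporary resharding collection, and the check treats "unknown" the same as "not sharded".

```cpp
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the cached collection metadata on a shard.
struct ToyMetadata {
    bool sharded;
};

// Hypothetical stand-in for the error raised when the check fails.
struct NamespaceNotShardedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for a node's in-memory metadata cache. A freshly
// stepped-up primary is modeled as an empty cache: the metadata is
// "unknown", which is different from "known to be unsharded".
class ToyShardingRuntime {
public:
    std::optional<ToyMetadata> getCurrentMetadataIfKnown(const std::string& nss) const {
        auto it = _cache.find(nss);
        if (it == _cache.end()) {
            return std::nullopt;  // nothing loaded yet for this namespace
        }
        return it->second;
    }

    void setMetadata(const std::string& nss, ToyMetadata metadata) {
        _cache.insert_or_assign(nss, metadata);
    }

private:
    std::map<std::string, ToyMetadata> _cache;
};

// Mirrors the shape of the check in the description: it passes only when
// metadata is both known and sharded, so an empty cache also throws.
void assertTempCollectionIsSharded(const ToyShardingRuntime& runtime, const std::string& nss) {
    const auto metadata = runtime.getCurrentMetadataIfKnown(nss);
    if (!(metadata && metadata->sharded)) {
        throw NamespaceNotShardedError("Temporary resharding collection " + nss +
                                       " is not sharded");
    }
}
```

In this model, calling `assertTempCollectionIsSharded` on a default-constructed `ToyShardingRuntime` throws for any namespace, and the same call succeeds once `setMetadata(nss, ToyMetadata{true})` has run, which is the gap a failover can expose.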

      The ReshardingCollectionCloner and ReshardingOplogApplier rely on assertCanExtractShardKeyFromDocs() to ensure an update doesn't write an invalid shard key value (e.g. an array value) under the new shard key pattern. We must do one of the following:

      • (a) ensure the collection metadata for the temporary resharding collection has been loaded before the ReshardingCollectionCloner or ReshardingOplogApplier attempt to write to it, or
      • (b) retry in ReshardingCollectionCloner and ReshardingOplogApplier on the exception thrown by assertCanExtractShardKeyFromDocs(), or
      • (c) move the checks into ReshardingCollectionCloner and ReshardingOplogApplier directly, and allow unversioned (direct) writes to the temporary resharding collection if they are being performed by an internal (system) Client.

      const auto metadata = CollectionShardingRuntime::get(opCtx, nss)->getCurrentMetadataIfKnown();
      // A user can manually create a 'db.system.resharding.' collection that isn't guaranteed to be
      // sharded outside of running reshardCollection.
      uassert(ErrorCodes::NamespaceNotSharded,
              str::stream() << "Temporary resharding collection " << nss.toString()
                            << " is not sharded",
              metadata && metadata->isSharded());
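Option (b) could look roughly like the following. This is a minimal sketch under stated assumptions, not the fix that was adopted: `writeWithRetry`, `NamespaceNotShardedError`, and the `write`/`refresh` callbacks are hypothetical names, and `refresh` stands in for whatever would load the temporary collection's metadata. The retry is bounded because, as the code comment above notes, a user-created collection may genuinely be unsharded and must still fail.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the error thrown by the sharded-ness check.
struct NamespaceNotShardedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Runs `write`. If it fails with NamespaceNotShardedError, calls `refresh`
// (hypothetical: load the collection metadata) and tries again, up to
// `maxAttempts` total attempts. Returns the attempt number that succeeded.
// Once the attempts are exhausted, the last error is rethrown unchanged.
int writeWithRetry(const std::function<void()>& write,
                   const std::function<void()>& refresh,
                   int maxAttempts) {
    for (int attempt = 1;; ++attempt) {
        try {
            write();
            return attempt;
        } catch (const NamespaceNotShardedError&) {
            if (attempt >= maxAttempts) {
                throw;  // still not sharded after refreshing: a real error
            }
            refresh();
        }
    }
}
```

Calling `refresh` once before the first `write` instead of after a failure would turn the same helper into a sketch of option (a); the trade-off is paying for a metadata load on every start rather than only after a failed check.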

            max.hirschhorn@mongodb.com Max Hirschhorn