  1. Core Server
  2. SERVER-60858

_configsvrReshardCollection command which joins existing ReshardingCoordinator may miss being interrupted on stepdown

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • 5.0.0, 5.1.0-rc0
    • 5.2.0, 5.0.4, 5.1.0-rc2
    • Sharding
    • None
    • Fully Compatible
    • ALL
    • v5.1, v5.0
    • Sharding 2021-11-01
    • 1

    Description

      Following a retryable error, the primary shard for the database running the _shardsvrReshardCollection command re-sends the _configsvrReshardCollection command to the config server primary. The new invocation of _configsvrReshardCollection joins the existing ReshardingCoordinator instance rather than constructing a new one. However, in this case setAlwaysInterruptAtStepDownOrUp() has not been called on the OperationContext for the _configsvrReshardCollection command. Neither the future that signals the coordinator document has been written nor the resharding operation's completion future is guaranteed to become ready with an error on stepdown or shutdown. As a result, the _configsvrReshardCollection command can continue running on the config server node after it has stepped down.

      We should call setAlwaysInterruptAtStepDownOrUp() on the OperationContext before waiting on these futures. Then, if the config server primary steps down, the wait is interrupted and the primary shard for the database running the _shardsvrReshardCollection command re-sends the _configsvrReshardCollection command to the new config server primary. The code path that joins the existing instance currently looks like this:

      if (auto existingInstance =
              getExistingInstanceToJoin(opCtx, nss, request().getKey())) {
          // Join the existing resharding operation to prevent generating a new resharding
          // instance if the same command is issued consecutively due to client disconnect
          // etc.
          reshardCollectionJoinedExistingOperation.pauseWhileSet(opCtx);
          existingInstance.get()->getCoordinatorDocWrittenFuture().get(opCtx);
          return existingInstance;
      }
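
The effect of the missing call can be illustrated with a small standalone model. This is only a sketch under stated assumptions: MockOperationContext, onStepDown(), WaitResult, waitOnFuture() and joinExistingOperation() are hypothetical names invented for the illustration and are not the server's classes or APIs. The only name taken from the ticket is setAlwaysInterruptAtStepDownOrUp(). The never-fulfilled promise stands in for a coordinator future that does not become ready on stepdown.

```cpp
#include <atomic>
#include <chrono>
#include <future>

// Hypothetical stand-in for an operation context. NOT MongoDB's OperationContext.
class MockOperationContext {
public:
    // Opt in to being killed when the node changes replica set state.
    void setAlwaysInterruptAtStepDownOrUp() { _alwaysInterrupt.store(true); }

    // Simulates a stepdown: only operations that opted in are killed.
    void onStepDown() {
        if (_alwaysInterrupt.load()) {
            _killed.store(true);
        }
    }

    bool isKilled() const { return _killed.load(); }

private:
    std::atomic<bool> _alwaysInterrupt{false};
    std::atomic<bool> _killed{false};
};

enum class WaitResult { kReady, kInterrupted, kStillWaiting };

// Waits on the future while polling the context for interruption.
// maxWait bounds the demo; a real wait could block indefinitely.
WaitResult waitOnFuture(const MockOperationContext& opCtx,
                        const std::shared_future<void>& future,
                        std::chrono::milliseconds maxWait) {
    const auto deadline = std::chrono::steady_clock::now() + maxWait;
    while (std::chrono::steady_clock::now() < deadline) {
        if (opCtx.isKilled()) {
            return WaitResult::kInterrupted;
        }
        if (future.wait_for(std::chrono::milliseconds(1)) == std::future_status::ready) {
            return WaitResult::kReady;
        }
    }
    return WaitResult::kStillWaiting;
}

// Models a command joining an existing operation whose future is never
// resolved, followed by a stepdown of the node.
WaitResult joinExistingOperation(bool markInterruptible) {
    std::promise<void> neverFulfilled;  // assumption: not resolved on stepdown
    std::shared_future<void> future = neverFulfilled.get_future().share();

    MockOperationContext opCtx;
    if (markInterruptible) {
        opCtx.setAlwaysInterruptAtStepDownOrUp();  // the call the ticket proposes
    }
    opCtx.onStepDown();

    return waitOnFuture(opCtx, future, std::chrono::milliseconds(50));
}
```

In this model, joinExistingOperation(true) returns WaitResult::kInterrupted, while joinExistingOperation(false) keeps waiting until the demo deadline and returns WaitResult::kStillWaiting, which mirrors the command continuing to run after stepdown.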
      

            People

              max.hirschhorn@mongodb.com Max Hirschhorn