Core Server / SERVER-61985

resharding_coordinator_recovers_abort_decision.js may report resharding operation as succeeding due to primary shard retrying _configsvrReshardCollection and running a second resharding operation

    • Fully Compatible
    • ALL
    • v6.0, v5.0
    • Sharding 2022-06-27, Sharding 2022-07-11
    • 163
    • 3

      The ReshardingTest fixture configures the reshardingPauseCoordinatorBeforeCompletion failpoint with {times: 1}, which means the failpoint is automatically disabled as soon as a ReshardingCoordinator reaches it and therefore won't actually pause the ReshardingCoordinator. This is problematic for cases where the reshardCollection command is expected to error (i.e. tests which use expectedErrorCode !== ErrorCodes.OK), because the primary shard can retry _configsvrReshardCollection, and by then the config server will have forgotten about the earlier aborted resharding operation. This can lead to an entire second resharding operation running. Because that second operation runs entirely after duringReshardingFn has finished executing, it won't abort the way the first resharding operation did, and the test may see the resharding operation reported as succeeding.
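
To make the difference between the failpoint modes concrete, here is a minimal sketch of the configureFailPoint command documents involved. The helper name buildFailPointCommand is hypothetical and exists only for illustration; it is not how the ReshardingTest fixture is written. Only the command shape (configureFailPoint, mode) is taken from the server's failpoint command.

```javascript
// Hypothetical helper: builds a configureFailPoint command document.
// It does not talk to a server; it only shows the shape of the command.
function buildFailPointCommand(name, mode) {
    return {configureFailPoint: name, mode: mode};
}

const failPointName = "reshardingPauseCoordinatorBeforeCompletion";

// Mode described in this ticket: the failpoint turns itself off after
// it has been hit once.
const oneShot = buildFailPointCommand(failPointName, {times: 1});

// Alternative mode: the failpoint stays enabled until explicitly turned off.
const persistent = buildFailPointCommand(failPointName, "alwaysOn");

// Command a test would send to release the failpoint again.
const disable = buildFailPointCommand(failPointName, "off");
```

In a jstest these documents would be passed to adminCommand() on the config server primary; that step is omitted here because it needs a running cluster.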

      We should revert the changes made to the ReshardingTest fixture in 38c6aff (as part of SERVER-52730) so that the ReshardingCoordinator remains paused. This will require a different way to keep resharding_prohibited_commands.js, which runs a second reshardCollection command, from getting stuck. That can likely be done by passing data into the reshardingPauseCoordinatorBeforeCompletion failpoint so that it only pauses the ReshardingCoordinator for a particular source namespace.
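
The namespace-scoped failpoint proposed above could look roughly like the sketch below. The data field name sourceNamespace, the helper names buildPauseCommand and wouldPause, and the namespaces used in the usage lines are all assumptions made for illustration; the ticket does not specify them. wouldPause only models the check the server would need to perform and is not server code.

```javascript
// Hypothetical: command document that keeps the failpoint enabled but
// attaches the namespace it should apply to. The "sourceNamespace" field
// name is an assumption, not something this ticket specifies.
function buildPauseCommand(sourceNamespace) {
    return {
        configureFailPoint: "reshardingPauseCoordinatorBeforeCompletion",
        mode: "alwaysOn",
        data: {sourceNamespace: sourceNamespace},
    };
}

// Hypothetical model of the server-side decision: pause only when the
// coordinator's source namespace matches the failpoint data. With no
// namespace in the data, every coordinator pauses.
function wouldPause(failPointData, coordinatorNamespace) {
    if (failPointData === undefined || failPointData.sourceNamespace === undefined) {
        return true;
    }
    return failPointData.sourceNamespace === coordinatorNamespace;
}

// Usage with made-up namespaces: the first coordinator stays paused,
// a second reshardCollection on another collection is not blocked.
const cmd = buildPauseCommand("reshardingDb.coll");
const pausesFirst = wouldPause(cmd.data, "reshardingDb.coll");
const pausesSecond = wouldPause(cmd.data, "reshardingDb.otherColl");
```

Scoping by namespace would let the fixture keep the original ReshardingCoordinator paused without affecting a test's second reshardCollection command on a different collection.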

      We should also revert the test changes to resharding_nonblocking_coordinator_rebuild.js from SERVER-61607, because I hadn't realized until now that the behavior of the reshardingPauseCoordinatorBeforeCompletion failpoint was the culprit.

            abdul.qadeer@mongodb.com Abdul Qadeer
            max.hirschhorn@mongodb.com Max Hirschhorn