  1. Core Server
  2. SERVER-62720

_configsvrReshardCollection can fail to join existing operation

    • Fully Compatible
    • ALL
    • v6.1, v6.0, v5.0
    • Sharding 2022-08-08, Sharding 2022-08-22, Sharding 2022-09-05, Sharding 2022-09-19, Sharding 2022-10-03
    • 60
    • 4

      Currently, the _configsvrReshardCollection command relies on its own getExistingInstanceToJoin() function to determine whether an existing ReshardingCoordinator is already executing the same resharding operation. This function gets and iterates over the existing ReshardingCoordinator instances rather than checking atomically by overriding checkIfConflictsWithOtherInstances(). As a result, if two identical _configsvrReshardCollection commands execute in quick succession (e.g. due to an election on the primary shard for the database), both can see no existing coordinator and each proceed to create one.

      As seen in BF-23979, this will manifest as a Location5808201 error for the coordinator that loses the race to set allowMigrations to false.
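
      The check-then-create race described above can be illustrated with a small, self-contained Python sketch. This is not MongoDB code: CoordinatorRegistry, find_existing, create, and get_or_create are hypothetical names chosen for illustration. The racy path mirrors "look up existing instances, then create", replayed as an explicit interleaving of two commands. The atomic path does the conflict check and the insertion under one lock, which is the general idea behind checking for conflicts atomically.

```python
import threading


class CoordinatorRegistry:
    """Toy stand-in for a registry of running coordinators (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._instances = {}  # operation key -> coordinator id
        self.created = []     # every coordinator ever constructed

    # Racy pattern: lookup and creation are two separate steps.
    def find_existing(self, key):
        return self._instances.get(key)

    def create(self, key):
        coordinator_id = len(self.created) + 1
        self.created.append((key, coordinator_id))
        self._instances[key] = coordinator_id
        return coordinator_id

    # Atomic pattern: the conflict check and the insertion share one lock.
    def get_or_create(self, key):
        with self._lock:
            existing = self._instances.get(key)
            if existing is not None:
                return existing, False  # joined the existing operation
            return self.create(key), True


def racy_interleaving():
    """Replay the bad schedule: both commands look before either creates."""
    registry = CoordinatorRegistry()
    key = ("test.coll", "newShardKey")  # illustrative operation identity
    seen_a = registry.find_existing(key)
    seen_b = registry.find_existing(key)
    if seen_a is None:
        registry.create(key)
    if seen_b is None:
        registry.create(key)
    return len(registry.created)


def atomic_calls():
    """The same two commands, routed through the atomic check."""
    registry = CoordinatorRegistry()
    key = ("test.coll", "newShardKey")
    first = registry.get_or_create(key)
    second = registry.get_or_create(key)
    return len(registry.created), first, second
```

      In this sketch racy_interleaving() ends with two coordinators for the same operation, while atomic_calls() creates one and the second caller joins it. The real fix may differ in detail; the sketch only shows why a non-atomic lookup leaves a window between the check and the creation.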

            Assignee:
            abdul.qadeer@mongodb.com Abdul Qadeer
            Reporter:
            brett.nawrocki@mongodb.com Brett Nawrocki
            Votes:
            0
            Watchers:
            3
