Core Server / SERVER-45707

Test that range deletion tasks are eventually deleted even if collection is dropped before migration coordination is resumed

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.3.4
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Sharding 2020-02-10

      When a replica set node transitions to primary, it launches an asynchronous task to resume coordinating migrations whose state was checkpointed in a document in config.migrationCoordinators.
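The checkpoint document referred to above can be pictured roughly as follows. This is a toy sketch for illustration only; the field names here are assumptions, not the server's actual on-disk schema:

```python
# Illustrative shape of a config.migrationCoordinators checkpoint document.
# All field names are hypothetical; see the server source for the real schema.
migration_coordinator_doc = {
    "_id": "migration-session-id",            # hypothetical identifier
    "nss": "test.coll",                       # namespace being migrated
    "collectionUuid": "uuid-1",               # collection UUID at migration start
    "donorShardId": "shard0",
    "recipientShardId": "shard1",
    "range": {"min": {"x": 0}, "max": {"x": 100}},
    "decision": None,                         # None until committed/aborted is checkpointed
}
```

The key points for this ticket are that the document records the collection's UUID at the time the migration started, and that "decision" may still be unset when the node steps up.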

      If the outcome of the migration (committed or aborted) had not been checkpointed, the node runs the _configsvrEnsureChunkVersionIsGreaterThan command against the config server to guarantee that the migration's decision cannot change after that command executes. It then forces a filtering metadata refresh and checks whether the min key of the migrating range still belongs to the local shard.
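The recovery sequence above can be sketched as a small state machine. This is a toy model under stated assumptions, not the server implementation; the function and callback names are hypothetical:

```python
def recover_decision(decision, ensure_chunk_version_gt, refresh_metadata, owns_min_key):
    """Toy model of resuming a migration on step-up.

    decision: 'committed', 'aborted', or None if no outcome was checkpointed.
    The three callbacks stand in for the server-side steps described above.
    """
    if decision is None:
        # Freeze the outcome on the config server first: after this returns,
        # the migration's decision can no longer change out from under us.
        ensure_chunk_version_gt()
        # Then refresh filtering metadata and infer the outcome locally:
        # if the shard still owns the range's min key, the migration
        # cannot have committed, so it must have aborted.
        metadata = refresh_metadata()
        decision = "aborted" if owns_min_key(metadata) else "committed"
    return decision
```

If the decision was already checkpointed, the model simply returns it without consulting the config server.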

      As part of this check, the node verifies that the filtering metadata's collection UUID matches the UUID in the config.migrationCoordinators document. If they do not match, it cleans up the range deletion tasks on itself and on the recipient, as well as the migrationCoordinators document itself.
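The UUID-mismatch branch can be modeled as below. Again a hedged sketch, not the server code; the action strings are placeholders for the real deletions:

```python
def cleanup_on_uuid_mismatch(filtering_uuid, coordinator_doc, actions):
    """Toy model: if the refreshed metadata's collection UUID differs from the
    UUID checkpointed in the coordinator document (e.g. because the collection
    was dropped, or dropped and recreated), the migration is moot, so all of
    its persisted state is cleaned up. Returns True if cleanup happened."""
    if filtering_uuid != coordinator_doc["collectionUuid"]:
        actions.append("delete donor range deletion task")
        actions.append("delete recipient range deletion task")
        actions.append("delete migrationCoordinators doc")
        return True
    return False
```

When the UUIDs match, the node instead proceeds with the normal commit/abort path for the still-existing collection.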

      This ticket is to add a test that this cleanup does occur, along these lines:

    • Induce a failover on the donor shard.
    • Enable a failpoint so the recovery task hangs before the filtering metadata refresh.
    • Drop the collection from the cluster while the failpoint is active.
    • Unset the failpoint and assert that the range deletion tasks and the migrationCoordinators document are eventually deleted.
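The test scenario can be dry-run as an in-memory simulation. This is a toy model of the sequence, with all names hypothetical; the real test would be a jstest driving an actual sharded cluster with configureFailPoint:

```python
class ToyDonor:
    """Minimal in-memory model of the donor shard in this ticket's scenario."""

    def __init__(self):
        self.range_deletion_tasks = ["range-deletion-task"]
        self.coordinator_docs = [{"collectionUuid": "uuid-1"}]
        self.collection_uuid = "uuid-1"     # current local collection UUID
        self.hang_before_refresh = False    # stands in for the failpoint

    def drop_collection(self):
        # Dropping the collection means there is no longer a matching UUID.
        self.collection_uuid = None

    def resume_coordination(self):
        # Models the step-up recovery task.
        if self.hang_before_refresh:
            return  # failpoint active: recovery is parked before the refresh
        for doc in list(self.coordinator_docs):
            if self.collection_uuid != doc["collectionUuid"]:
                # UUID mismatch: clean up tasks and the coordinator doc.
                self.range_deletion_tasks.clear()
                self.coordinator_docs.remove(doc)
```

Driving the model through the steps above (failpoint on, resume hangs, drop, failpoint off, resume again) should leave both range_deletion_tasks and coordinator_docs empty, which is exactly the assertion the real test would make.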

            Assignee:
            cheahuychou.mao@mongodb.com Cheahuychou Mao
            Reporter:
            esha.maharishi@mongodb.com Esha Maharishi (Inactive)
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: