Core Server / SERVER-32592

Stepdown during migration cleanup can crash the source shard primary

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.3, 3.7.2
    • Affects Version/s: 3.6.2
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.6
    • Sprint: Sharding 2018-01-29, Sharding 2018-02-12
    • 0

      If the source shard primary steps down during a migration, the stepdown can trigger one of the cleanupOnError scope guards in the MigrationSourceManager. The guard calls MigrationSourceManager::_cleanup, which can call ShardServerCatalogCacheLoader::waitForCollectionFlush. That function uasserts if the node is no longer primary. Because the call happens inside a scope guard and the exception is not caught, it triggers std::terminate() and crashes the server.

      Example failure: https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_coverage_concurrency_sharded_with_stepdowns_and_balancer_patch_e4ba7722773f68d42a66af7439e585cc2136d003_5a4ceaabe3c3316388000020_18_01_03_14_41_42/0
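
The failure mode described above can be illustrated with a minimal, self-contained C++ sketch. It is not the server's code and not necessarily the fix that was applied: ScopeGuard, waitForFlush, cleanupNoThrow, and runMigrationStep are hypothetical stand-ins for the scope guard, waitForCollectionFlush, the cleanup routine, and a migration step. The sketch shows one way to keep an exception from escaping a guard callback, namely catching it inside the cleanup routine. If the callback let the exception propagate out of the guard's destructor, the program would call std::terminate().

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a scope guard: runs a callback on scope exit
// unless dismissed. Destructors are implicitly noexcept, so an exception
// escaping the callback here would call std::terminate().
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {
        assert(_fn);  // an empty callback would be a programming error
    }
    ~ScopeGuard() {
        if (!_dismissed) _fn();
    }
    void dismiss() { _dismissed = true; }

    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> _fn;
    bool _dismissed = false;
};

// Hypothetical stand-in for a call like waitForCollectionFlush: throws when
// the node is no longer primary.
void waitForFlush(bool isPrimary) {
    if (!isPrimary) throw std::runtime_error("not primary");
}

// Cleanup that is safe to run from a guard: the exception is caught here and
// reported as a message (empty on success) instead of propagating.
std::string cleanupNoThrow(bool isPrimary) {
    try {
        waitForFlush(isPrimary);
        return "";
    } catch (const std::exception& ex) {
        return ex.what();
    }
}

// Simulates a migration step. After a stepdown the step fails, the guard runs
// during stack unwinding, and the cleanup itself also sees "not primary".
std::string runMigrationStep(bool steppedDown) {
    std::string cleanupError;
    try {
        ScopeGuard cleanupOnError(
            [&] { cleanupError = cleanupNoThrow(!steppedDown); });
        if (steppedDown) {
            throw std::runtime_error("migration interrupted by stepdown");
        }
        cleanupOnError.dismiss();  // success path: no cleanup needed
    } catch (const std::exception&) {
        // The original failure is handled here; the guard has already run.
    }
    return cleanupError;
}
```

Calling runMigrationStep(true) exercises the stepdown path: the cleanup error is captured as a string rather than crashing the process, while runMigrationStep(false) dismisses the guard and returns an empty string.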

            Assignee: Jack Mulrow (jack.mulrow@mongodb.com)
            Reporter: Jack Mulrow (jack.mulrow@mongodb.com)
            Votes: 0
            Watchers: 3
