Core Server / SERVER-32592

Stepdown during migration cleanup can crash the source shard primary

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.6.2
    • Fix Version/s: 3.6.3, 3.7.2
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v3.6
    • Sprint:
      Sharding 2018-01-29, Sharding 2018-02-12
    • Linked BF Score:
      0

      Description

      If the source shard primary steps down during a migration, it can trigger one of the cleanupOnError scope guards in the MigrationSourceManager. The guard calls MigrationSourceManager::_cleanup, which can call ShardServerCatalogCacheLoader::waitForCollectionFlush, which uasserts if the node is no longer primary. Because the call happens inside a scope guard and the exception is not caught, it triggers std::terminate() and crashes the server.
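
      The failure mode described above can be illustrated with a minimal, standalone C++ sketch. It is not the server code: ScopeGuard, waitForFlush, migrateUnsafe and migrateSafe are hypothetical stand-ins, and catching inside the guard is only one possible way to contain the error, not necessarily the fix that was applied for this ticket.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a scope guard: runs a callback when the scope exits.
// Destructors are implicitly noexcept, so an exception that escapes the
// callback ends in std::terminate().
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {
        assert(static_cast<bool>(_fn));  // an empty callback is a programming error
    }
    ~ScopeGuard() {
        if (_active) {
            _fn();
        }
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

    // Disarm the guard so the callback does not run on scope exit.
    void dismiss() {
        _active = false;
    }

private:
    std::function<void()> _fn;
    bool _active = true;
};

// Hypothetical stand-in for a flush wait that throws once the node has
// stepped down (the role uassert plays in the description).
void waitForFlush(bool isPrimary) {
    if (!isPrimary) {
        throw std::runtime_error("not primary");
    }
}

// Unsafe pattern: if waitForFlush throws, the exception leaves the guard's
// destructor and the process terminates. Shown for illustration only; do not
// call this with isPrimaryDuringCleanup == false.
void migrateUnsafe(bool isPrimaryDuringCleanup) {
    ScopeGuard guard([&] { waitForFlush(isPrimaryDuringCleanup); });
    // ... migration work; the guard fires when this scope exits ...
}

// Safer pattern: the cleanup callback catches its own failures, so nothing
// propagates out of the destructor.
std::string migrateSafe(bool isPrimaryDuringCleanup) {
    std::string cleanupResult = "cleanup not run";
    {
        ScopeGuard guard([&] {
            try {
                waitForFlush(isPrimaryDuringCleanup);
                cleanupResult = "cleanup ok";
            } catch (const std::exception& ex) {
                cleanupResult = std::string("cleanup failed: ") + ex.what();
            }
        });
        // ... migration work; the guard fires at the closing brace ...
    }
    return cleanupResult;
}
```

      In this sketch, migrateSafe(false) returns a string describing the failed cleanup instead of terminating the process, whereas migrateUnsafe(false) would end in std::terminate().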

      Example failure: https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_coverage_concurrency_sharded_with_stepdowns_and_balancer_patch_e4ba7722773f68d42a66af7439e585cc2136d003_5a4ceaabe3c3316388000020_18_01_03_14_41_42/0

            People

            Assignee:
            jack.mulrow Jack Mulrow
            Reporter:
            jack.mulrow Jack Mulrow
            Votes:
            0
            Watchers:
            3
