Core Server / SERVER-32593

CSRS stepdown during migration commit can trigger fassert on source shard primary

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.3, 3.7.2
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.6
    • Sprint: Sharding 2018-01-15, Sharding 2018-01-29, Sharding 2018-02-12
    • 0

      During the critical section of a migration, the source shard sends _configsvrCommitChunkMigration to the CSRS primary. This command can fail if the primary recently stepped down. When it fails, the source shard tries to log "moveChunk.validating" on the CSRS primary to update its optime before refreshing metadata. If that write also fails, the source shard fasserts.
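
      The failure path described above can be modeled roughly as follows. This is a simplified Python sketch of the control flow only, not the server's actual C++ code: NotMasterError, FatalAssertion, commit_migration, and the three callables are hypothetical names introduced for illustration, and it assumes the commit failure surfaces as a single "not master" style error.

```python
class NotMasterError(Exception):
    """Hypothetical stand-in for the error a stepped-down CSRS primary returns."""


class FatalAssertion(Exception):
    """Hypothetical stand-in for fassert; the real server aborts the process."""


def commit_migration(commit_on_csrs, log_on_csrs, refresh_metadata):
    """Model the source shard's commit path inside the critical section.

    The three arguments are hypothetical callables standing in for the
    _configsvrCommitChunkMigration request, the "moveChunk.validating"
    write to the CSRS primary, and the metadata refresh.
    """
    try:
        commit_on_csrs()
        return "committed"
    except NotMasterError:
        # The CSRS primary stepped down, so the commit outcome is unknown.
        pass

    try:
        # A successful write to the CSRS primary updates the shard's optime,
        # which the ticket says must happen before refreshing metadata.
        log_on_csrs("moveChunk.validating")
    except NotMasterError as exc:
        # Second failure in a row: this is where the source shard fasserts.
        raise FatalAssertion("could not write to CSRS primary") from exc

    return refresh_metadata()


def stepped_down(*_args):
    """Simulate a CSRS primary that has just stepped down."""
    raise NotMasterError("not master")


if __name__ == "__main__":
    try:
        commit_migration(stepped_down, stepped_down, lambda: "refreshed")
    except FatalAssertion:
        print("source shard would fassert")
```

      In this model a single stepdown is survivable (the changelog write succeeds against the new primary and the shard refreshes), while a stepdown that spans both requests hits the fatal branch, which is the case the continuous stepdown suite keeps triggering.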

      From this comment, it seems that this is desired behavior, but it's a problem for the continuous stepdowns concurrency suite with the balancer enabled, since background migrations can crash servers and fail the test when the cluster is torn down.

      Example failure: https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_concurrency_sharded_with_stepdowns_and_balancer_WT_patch_b8f64cc3fde6d041f3e90b1cb2e153b0b15f6c47_5a20e4efe3c33173de00c68d_17_12_01_05_13_55

            Assignee: Jack Mulrow (jack.mulrow@mongodb.com)
            Reporter: Jack Mulrow (jack.mulrow@mongodb.com)