ReshardingCoordinator unit test toggles allowMigrations without bumping the shard version


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing
    • Fully Compatible
    • ALL
    • CAR Team 2026-05-11
    • 0
    • 🟥 DDL

      The test case 'CausalityBarrierInvokedOnRecovery' is hitting tassert 7032310 because the test-fixture mock of resumeMigrations flips the allowMigrations flag without a placement-version bump, unlike the production path, which calls setAllowMigrationsAndBumpOneChunk.

      Steps to failure

      After resharding commits, the first iteration of the post-commit WithAutomaticRetry block runs _resumeMigrations, which the mock applies as a no-bump write. A NotWritablePrimary error from a fire-and-forget command then causes the block to retry. On the second iteration, _updateChunkImbalanceMetrics calls getCollectionPlacementInfoWithRefresh on db.foo, and the catalog cache sees allowMigrations flip false→true at the same placement version, which triggers the tassert in catalog_cache.cpp.
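
      The sequence above can be modelled with a small standalone sketch. It is not the server code: CollectionEntry, ToyCatalogCache, setAllowMigrationsNoBump, setAllowMigrationsAndBump and replayTripsInvariant are hypothetical names invented for illustration, and the invariant check is only a simplified stand-in for whatever tassert 7032310 actually guards. It also assumes the cache had already observed allowMigrations=false before the resume step, and it requires C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Toy stand-in for the per-collection routing metadata (hypothetical type).
struct CollectionEntry {
    std::uint64_t placementVersion = 1;
    bool allowMigrations = true;
};

// Hypothetical no-bump write, modelled on the test-fixture mock in this ticket.
inline void setAllowMigrationsNoBump(CollectionEntry& entry, bool allow) {
    entry.allowMigrations = allow;
}

// Hypothetical bumping write, modelled on what the ticket says the production
// path does (flag change plus a placement-version bump).
inline void setAllowMigrationsAndBump(CollectionEntry& entry, bool allow) {
    entry.allowMigrations = allow;
    ++entry.placementVersion;
}

// Toy cache that remembers the last entry it refreshed from.
class ToyCatalogCache {
public:
    // Throws if allowMigrations changed while the placement version did not.
    void refresh(const CollectionEntry& authoritative) {
        if (_cached &&
            _cached->placementVersion == authoritative.placementVersion &&
            _cached->allowMigrations != authoritative.allowMigrations) {
            throw std::logic_error(
                "allowMigrations changed without a placement version bump");
        }
        _cached = authoritative;
    }

private:
    std::optional<CollectionEntry> _cached;
};

// Replays the steps from the ticket; returns true if the invariant trips.
inline bool replayTripsInvariant(bool resumeBumpsVersion) {
    CollectionEntry entry;
    ToyCatalogCache cache;

    // Assumption: migrations were stopped earlier and the cache saw that state.
    setAllowMigrationsAndBump(entry, false);
    assert(!entry.allowMigrations);
    cache.refresh(entry);

    // First iteration of the retry block: resume migrations.
    if (resumeBumpsVersion) {
        setAllowMigrationsAndBump(entry, true);
    } else {
        setAllowMigrationsNoBump(entry, true);
    }

    // Second iteration after the retry: a refresh observes the new flag value.
    try {
        cache.refresh(entry);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

      In this model, replayTripsInvariant(false) reproduces the failure shape described above, while replayTripsInvariant(true) passes because the flag change arrives together with a new placement version.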

      Why it started failing now
      The race has been latent in the mock since the resumeMigrations stub was written. It appears to have become reachable only after SERVER-123498 introduced CausalityBarrierInvokedOnRecovery, the first test to run resharding through a full step-down/step-up cycle and then all the way to completion.

            Assignee:
            Silvia Surroca
            Reporter:
            Silvia Surroca