Race condition between listDatabases and movePrimary


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing
    • ALL
    • A reproducer is pushed on commit 8d2a5e9.
    • CAR Team 2026-01-19, CAR Team 2026-02-02, CAR Team 2026-02-16, CAR Team 2026-03-02, CAR Team 2026-03-16
    • 🟩 Routing and Topology

      On a sharded cluster, listDatabases can omit a database that is being moved by a movePrimary operation running concurrently with it.

      Root cause

      When listDatabases is executed on a sharded cluster, mongos queries all shards sequentially and merges the results. This allows the following race:

      • listDatabases queries the recipient shard before movePrimary completes, so the recipient does not have the database yet.
      • listDatabases then queries the donor shard after movePrimary completes, so the donor no longer reports the database.

      When both happen within the same listDatabases call, the database being moved is missing from the response.
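
      The interleaving described above can be sketched with a toy model. This is not MongoDB code: the shard names, the `between_queries` hook, and the atomic `move_primary` helper are all hypothetical and exist only to show the ordering that loses the database.

```python
# Toy model: each shard is represented by the set of database names it
# would report. Names and helpers here are hypothetical, not MongoDB APIs.

def list_databases(shards, order, between_queries=None):
    """Query shards one by one in `order` and merge the reported names.

    `between_queries` is a hypothetical hook invoked between two shard
    queries, used to inject a concurrent operation at that point.
    """
    merged = set()
    for i, name in enumerate(order):
        merged |= set(shards[name])  # per-shard snapshot, taken at query time
        if between_queries is not None and i < len(order) - 1:
            between_queries()
    return merged


def move_primary(shards, db, donor, recipient):
    """Move `db` from donor to recipient (modelled as a single atomic step)."""
    shards[donor].discard(db)
    shards[recipient].add(db)


shards = {"donor": {"test"}, "recipient": set()}

# Recipient is queried first, movePrimary completes, then donor is queried.
result = list_databases(
    shards,
    order=["recipient", "donor"],
    between_queries=lambda: move_primary(shards, "test", "donor", "recipient"),
)

print("listed:", result)        # "test" is absent from the merged result
print("actual state:", shards)  # yet "test" exists on the recipient
```

      In this model the database exists on exactly one shard at every instant, yet the merged result never sees it, because each shard is read at a different point in time.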

      Annotations

      • This issue can also occur with concurrent moveChunk or moveCollection operations in the following scenarios:
        • When moveChunk moves an entire sharded collection and this collection is the only one for its database.
        • When moveCollection moves a collection that is the only one for its database.
      • The race condition with movePrimary can only happen when all the collections of the database being moved are untracked, because movePrimary does not move tracked collections.
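
      The untracked-collections condition can be illustrated with a similar toy model. It assumes, for illustration only, that a shard reports a database whenever it holds at least one collection of that database. The namespaces and helper names are hypothetical.

```python
# Toy model: each shard holds a set of "db.collection" namespaces.
# Assumption (for illustration only): a shard reports a database when it
# holds at least one collection belonging to that database.

def reported_databases(namespaces):
    """Database names derived from the namespaces a shard holds."""
    return {ns.split(".", 1)[0] for ns in namespaces}


def move_primary(shards, db, donor, recipient, tracked):
    """Move only the untracked collections of `db`; tracked ones stay put."""
    moving = {
        ns for ns in shards[donor]
        if ns.split(".", 1)[0] == db and ns not in tracked
    }
    shards[donor] -= moving
    shards[recipient] |= moving


def racy_list_databases(shards, db, tracked):
    """Recipient queried first, movePrimary completes, then donor queried."""
    seen = reported_databases(shards["recipient"])
    move_primary(shards, db, "donor", "recipient", tracked)
    seen |= reported_databases(shards["donor"])
    return seen


# All collections untracked: the donor ends up empty and "test" is missed.
all_untracked = racy_list_databases(
    {"donor": {"test.a", "test.b"}, "recipient": set()}, "test", tracked=set()
)

# One tracked collection stays on the donor, so "test" is still reported.
one_tracked = racy_list_databases(
    {"donor": {"test.a", "test.b"}, "recipient": set()}, "test", tracked={"test.b"}
)

print(all_untracked, one_tracked)
```

      Under that assumption, a single tracked collection left behind on the donor is enough for the donor to keep reporting the database, which closes the window in this model.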

            Assignee:
            Igor Praznik
            Reporter:
            Silvia Surroca
            Votes:
            0
            Watchers:
            3
