  1. Core Server
  2. SERVER-82353

Multi-document transactions can miss documents when movePrimary runs concurrently

    • Catalog and Routing
    • Fully Compatible
    • v7.2, v7.0, v6.0, v5.0, v4.4
    • CAR Team 2023-11-13, CAR Team 2023-11-27, CAR Team 2023-12-11, CAR Team 2023-12-25, CAR Team 2024-01-08, CAR Team 2024-01-22, CAR Team 2024-02-05
    • 101
    • 3

       

      Issue Status as of April 15 2024

      This issue is included in MongoDB System Alert: Sharded multi-document transactions may perform operations using inconsistent sharding metadata. The information below describes only the behavior and impact related to SERVER-82353. Please see the consolidated issue page for guidance on identifying if you are impacted by these issues and remediation.

      SUMMARY

      Operations within a multi-document transaction may not correctly read, or be applied to, documents that belong to unsharded collections within a database that is migrating between shards as part of a movePrimary operation:

      • Reads may return incomplete results
      • Update or delete operations may not be applied to documents

      ISSUE DESCRIPTION AND IMPACT
      When a multi-document transaction operates on an unsharded collection while a movePrimary operation concurrently moves the database that contains that collection, a portion of the unsharded collection may not be visible to the transaction.

      This behavior affects multi-document transactions using a Read Concern of 'local' (default for reads), 'majority', or 'snapshot'.

      The issue affects MongoDB versions:

      • MongoDB 4.4.0 through 4.4.28
      • MongoDB 5.0.0 through 5.0.24
      • MongoDB 6.0.0 through 6.0.13
      • MongoDB 7.0.0 through 7.0.5
      • MongoDB Rapid Release 7.1.0, 7.1.1, 7.2.0
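
      The version ranges above can be expressed as a small check. This is a minimal sketch, not an official tool: the names AFFECTED_RANGES, parse_version and is_affected are hypothetical, the ranges are copied from the list above and cover SERVER-82353 only, and the parser assumes a plain "major.minor.patch" string with no suffix.

```python
# Affected ranges for SERVER-82353, copied from the list above.
# Each entry is (first affected version, last affected version), inclusive.
AFFECTED_RANGES = [
    ((4, 4, 0), (4, 4, 28)),
    ((5, 0, 0), (5, 0, 24)),
    ((6, 0, 0), (6, 0, 13)),
    ((7, 0, 0), (7, 0, 5)),
    ((7, 1, 0), (7, 1, 1)),
    ((7, 2, 0), (7, 2, 0)),
]


def parse_version(version: str) -> tuple:
    """Parse a plain 'major.minor.patch' string into a tuple of ints.

    Raises ValueError for anything else (for example a '-rc0' suffix),
    because this sketch makes no claim about pre-release builds.
    """
    parts = version.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {version!r}")
    return tuple(int(part) for part in parts)


def is_affected(version: str) -> bool:
    """Return True if the version falls inside one of the listed ranges."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for candidate in ("6.0.13", "6.0.14", "7.2.0", "7.2.1"):
        print(candidate, is_affected(candidate))
```

      The version string would typically come from the server itself, for example from db.version() in the shell. A version outside these ranges is not necessarily free of the other issues covered by the consolidated system alert.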

      The minimum conditions for the issue to manifest (all must be met) are:

      • Sharded cluster with more than one shard;
      • A movePrimary command was issued concurrently with...
      • A workload which uses either of:
        • A multi-statement transaction that:
          • Runs at local, majority, or snapshot read concern, and
          • Performs operations on unsharded collections being moved by the movePrimary command;
        • Queryable Encryption
          • Where the database being migrated contains collections which have one or more encrypted fields

      This issue occurs because, under these conditions, transactions against unsharded collections concurrently migrated via movePrimary perform operations on an earlier, intermediate-cloned state of the data. When this happens, a multi-document transaction may return partial results or not modify the expected documents.

      The table below describes what types of operations may be impacted, and how. Each entry lists what is affected, the effect, and the downstream effect:

      What is affected: Reads or Writes outside of a transaction
        Effect: None
        Downstream Effect: None

      What is affected: Within a transaction - Reads or Writes to any sharded collection, or unsharded collections outside of a migrating database
        Effect: None
        Downstream Effect: None

      What is affected: Writes to unsharded collections within a migrating database
        Effect:
          • Updates or deletes may miss documents which should be targeted.
          • Inserts on collections with unique indexes may fail with a duplicate key exception and automatically abort the transaction if they depend on update or delete operations which miss documents.
          • Writes on newly inserted documents will be correctly applied.
        Downstream Effect: Application level inconsistencies between documents. No replica set inconsistencies or index inconsistencies.

      What is affected: Reads from unsharded collections within a migrating database
        Effect: Possible incomplete results.
        Downstream Effect: Application-introduced inconsistencies if reads would prompt additional action.

      WORKAROUND

      If your workload utilizes multi-document transactions on a sharded cluster meeting the criteria above, we recommend that you:

      • Upgrade to MongoDB 5.0.25, MongoDB 6.0.14, MongoDB 7.0.6, or MongoDB Rapid Release 7.2.1
      • See the Remediation section below

      DIAGNOSIS & REMEDIATION

      See MongoDB System Alert: Sharded multi-document transactions may perform operations using inconsistent sharding metadata for guidance on assessing if you are impacted and the recommended remediation steps.

       

      ORIGINAL DESCRIPTION

       

      Consider a multi-document transaction with readConcern=snapshot (without atClusterTime provided by the client) involving an unsharded collection, and the following interleaving:

      1. Mongos chooses the 'atClusterTime' at which the transaction will run. Let's say it chooses TS100.
      2. Concurrently, a movePrimary executes. The recipient finishes cloning documents at TS200, and the operation commits at TS210.
      3. MovePrimary finishes and mongos becomes aware of the new db-primary shard.
      4. Now mongos proceeds with routing the transaction statement to the new primary, but with atClusterTime=TS100.
      5. On the shard, the databaseVersion check will pass, but the transaction will execute with a data snapshot at TS100, which predates the point at which the recipient finished cloning (TS200), so it won't see the documents.

      This can cause reads to not see the expected data, and writes to not modify the expected documents.
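
      The interleaving above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not MongoDB code: ToyShard is a hypothetical in-memory store in which each document becomes visible at the timestamp it was written on that shard, and all cloned documents are stamped with the clone-completion timestamp TS200.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical shard: a list of (write_timestamp, document) pairs."""
    name: str
    writes: list = field(default_factory=list)

    def write(self, ts, doc):
        self.writes.append((ts, doc))

    def snapshot_read(self, at_cluster_time):
        # A snapshot read only sees writes at or before the chosen timestamp.
        return [doc for ts, doc in self.writes if ts <= at_cluster_time]


donor = ToyShard("donor")
recipient = ToyShard("recipient")

# Documents that existed on the original db-primary shard long before TS100.
donor.write(10, {"_id": 1})
donor.write(20, {"_id": 2})

# Step 1: mongos picks atClusterTime for the transaction.
at_cluster_time = 100

# Step 2: movePrimary clones the documents; on the recipient they only
# exist from the clone timestamp onwards (TS200 in this toy model).
for _, doc in list(donor.writes):
    recipient.write(200, doc)

# Steps 3-5: the statement is routed to the new primary but still reads
# at TS100, which predates the cloned data on that shard.
seen_on_recipient = recipient.snapshot_read(at_cluster_time)
seen_on_donor = donor.snapshot_read(at_cluster_time)

print("recipient @TS100:", seen_on_recipient)
print("donor     @TS100:", seen_on_donor)
```

      In this model the same read at TS100 returns both documents on the donor and none on the recipient, which is the "missing documents" symptom described above.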

      Edit: A similar bug can occur with readConcerns other than snapshot. For instance, suppose that initially shard1 owns dbA and shard2 owns dbB:
      1. Mongos targets a first transaction statement for dbA to shard1. This opens a snapshot at T100 on that shard.
      2. MovePrimary moves dbB to shard1, which commits at T200.
      3. Mongos targets a second statement for dbB to shard1. DatabaseVersion check passes, but the snapshot used by the transaction on shard1 does not contain the expected data for dbB.
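
      The second scenario can be sketched in the same spirit. Again this is a toy model under stated assumptions, not the server's implementation: ToyShard and ToyTransaction are hypothetical names, and the transaction is assumed to pin one snapshot per shard when its first statement reaches that shard, then reuse it for later statements. The timestamp at which the cloned dbB document lands on shard1 (T150) is also an assumption; the source only says the move commits at T200.

```python
class ToyShard:
    """Hypothetical shard holding (write_ts, db_name, document) entries."""

    def __init__(self):
        self.writes = []

    def write(self, ts, db_name, doc):
        self.writes.append((ts, db_name, doc))

    def read(self, db_name, snapshot_ts):
        return [doc for ts, name, doc in self.writes
                if name == db_name and ts <= snapshot_ts]


class ToyTransaction:
    """Pins a snapshot per shard on the first statement sent to that shard."""

    def __init__(self):
        self.snapshot_by_shard = {}

    def read(self, shard_name, shard, db_name, now):
        # setdefault keeps the snapshot chosen by the first statement.
        snapshot_ts = self.snapshot_by_shard.setdefault(shard_name, now)
        return shard.read(db_name, snapshot_ts)


shard1 = ToyShard()
shard1.write(50, "dbA", {"_id": "a1"})

txn = ToyTransaction()

# Step 1: first statement targets dbA on shard1 and opens a snapshot at T100.
first = txn.read("shard1", shard1, "dbA", now=100)

# Step 2: movePrimary brings dbB's data onto shard1 after T100
# (assumed clone write at T150; the move commits at T200).
shard1.write(150, "dbB", {"_id": "b1"})

# Step 3: second statement targets dbB on shard1 but reuses the T100 snapshot.
second = txn.read("shard1", shard1, "dbB", now=210)

# A new transaction started after the move sees the document.
fresh = ToyTransaction().read("shard1", shard1, "dbB", now=210)

print(first, second, fresh)
```

      The second statement returns no documents for dbB even though the shard now owns the database, while a transaction started after the move sees them.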

            Assignee:
            jordi.serra-torrens@mongodb.com Jordi Serra Torrens
            Reporter:
            jordi.serra-torrens@mongodb.com Jordi Serra Torrens