  1. Core Server
  2. SERVER-43848

find/update/delete w/o shard key predicate under txn with snapshot read can miss documents


    Details

    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4, v4.2, v4.0
    • Sprint:
      Sharding 2020-03-09
    • Linked BF Score:
      7

      Description

      Scenario:

      shardKey: x: 1
      chunks: [MinKey, 0) @ shard1, [0, MaxKey) @ shard0

      1. Txn sets read concern timestamp to t5.
      2. A migration moves document x: 1, y: 1 from shard0 (last chunk) to shard1 at t10.
      3. Mongos refreshes to latest chunk metadata.
      4. Txn targets update/delete with predicate y: 1. This will generate an index bound of [MinKey, MaxKey).
      5. ChunkManager::getShardIdsForRange will go through every chunk that overlaps with [MinKey, MaxKey) and get the shardId at t5.
      6. However, the loop has an optimization that exits early once the number of shards targeted so far equals the number of shards in the shard version map. This causes the loop to exit early, so the write targets only shard1.

      The issue here is that the shard version map only includes shards that own chunks, and it represents the mapping at t10, not t5. In the case above, 2 shards had chunks at t5, but only 1 shard had chunks at t10. Even though the document is currently on shard1, the update/delete will not see it there because it runs under the snapshot with ts = t5.
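
      The scenario can be reproduced in a toy model. The sketch below is not the server's ChunkManager code: the Chunk class, its history list, shard_at, and the early_exit flag are hypothetical names chosen for illustration, and MinKey/MaxKey are approximated with infinities. It only assumes what the description states: each chunk's owner is looked up at the read timestamp, while the early-exit check compares against a shard version map built from the latest (t10) metadata.

```python
from dataclasses import dataclass

# Stand-ins for MinKey / MaxKey (assumption: numeric shard key values only).
MIN_KEY, MAX_KEY = float("-inf"), float("inf")


@dataclass
class Chunk:
    lo: float      # inclusive lower bound
    hi: float      # exclusive upper bound
    history: list  # [(valid_after_ts, shard_id), ...], newest entry first

    def shard_at(self, ts):
        """Return the shard that owned this chunk at timestamp ts."""
        for valid_after, shard in self.history:
            if valid_after <= ts:
                return shard
        raise LookupError("no owner recorded at this timestamp")


# Routing table as seen by mongos after refreshing at t10.
chunks = [
    Chunk(MIN_KEY, 0, [(0, "shard1")]),
    # Moved from shard0 to shard1 at t10; contains the document x: 1, y: 1.
    Chunk(0, MAX_KEY, [(10, "shard1"), (0, "shard0")]),
]

# Shard version map: only shards that own chunks in the *latest* metadata.
shard_versions = {c.history[0][1] for c in chunks}


def shard_ids_for_range(lo, hi, ts, early_exit=True):
    """Collect the shards owning chunks that overlap [lo, hi) at time ts."""
    targeted = set()
    for c in chunks:
        if c.hi <= lo or c.lo >= hi:
            continue  # chunk does not overlap the requested range
        targeted.add(c.shard_at(ts))
        # Early exit compares a t5 shard set against a t10 shard count.
        if early_exit and len(targeted) == len(shard_versions):
            break
    return targeted


with_exit = shard_ids_for_range(MIN_KEY, MAX_KEY, ts=5)
full_scan = shard_ids_for_range(MIN_KEY, MAX_KEY, ts=5, early_exit=False)
print(sorted(with_exit))  # shards targeted with the early exit
print(sorted(full_scan))  # shards that owned chunks at t5
```

      In this model the early exit fires after the first chunk because the shard version map holds a single shard, so shard0, which owned the document at t5, is never targeted. Disabling the early exit visits both chunks and returns both shards.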

            People

            Assignee:
            renctan Randolph Tan
            Reporter:
            renctan Randolph Tan