  1. Core Server
  2. SERVER-68046

Investigate handling of inconsistent collection UUIDs between shards

    • Type: Task
    • Resolution: Gone away
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Sharding EMEA
    • Sharding EMEA 2023-07-10

      Mismatching UUIDs for the same collection on different shards have been observed in some user clusters. The sections below explain what could have caused the issue and what its consequences are. The purpose of this ticket is to track the issue and decide whether and how to handle the mismatch.

      CAUSE

      The most probable cause of the mismatch is a client having used direct connections to shards: when routers are bypassed, shards are addressed as plain replica sets, so collection creation is handled as if the node were not part of a cluster. As a result, the collection is assigned a random UUID and treated as unsharded (but it potentially resides on several shards rather than only on the primary shard for its database).
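
A minimal sketch of how such a mismatch could be detected once the per-shard UUIDs have been collected (for example from the `info.uuid` field returned by `listCollections` on each shard). The function name `find_uuid_mismatches`, the shard names, and the UUID values are hypothetical. The reference UUID is assumed to be the one tracked in the sharding catalog for a sharded collection, or the one on the primary shard for an unsharded collection.

```python
from uuid import UUID


def find_uuid_mismatches(reference_uuid, uuid_by_shard):
    """Return {shard: uuid} for shards whose local UUID differs from the reference.

    A value of None means the collection does not exist on that shard,
    which is not a mismatch.
    """
    return {
        shard: local_uuid
        for shard, local_uuid in uuid_by_shard.items()
        if local_uuid is not None and local_uuid != reference_uuid
    }


# Hypothetical values, for illustration only.
reference = UUID("11111111-1111-1111-1111-111111111111")
observed = {
    "shard0": reference,
    # Assumed to have been created through a direct connection.
    "shard1": UUID("22222222-2222-2222-2222-222222222222"),
    "shard2": None,
}
mismatches = find_uuid_mismatches(reference, observed)
```

Here only `shard1` is reported, since `shard0` matches the reference and `shard2` does not host the collection.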

      CONSEQUENCES

      • Case 1: collection is unsharded
        • Queries passing through routers will only target the primary shard, so data present on all other shards will not be taken into consideration.
        • Queries passing through direct connections will only target a specific shard, ignoring all data contained in other shards.
      • Case 2: collection is sharded
        • Queries passing through routers only target shards according to the routing table. Data present on shards where the collection has a different UUID and no chunk will not be taken into consideration: given the metadata mismatch, no migration could possibly have happened on such shards, so they own no chunks.
        • Queries passing through direct connections will only target a specific shard, ignoring all data contained in other shards.
        • As of v3.6, no migrations can target shards with mismatching UUIDs for a collection (SERVER-31909)
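
The two cases above can be summarized with a deliberately simplified model of router targeting. This is only an illustrative sketch: `shards_ignored_by_router` and its parameters are hypothetical names, and real routers target shards per query based on the shard key, not per collection.

```python
def shards_ignored_by_router(shards_with_data, sharded, primary_shard=None, chunk_owners=()):
    """Return the shards that hold data but are never targeted through routers.

    Case 1 (unsharded): only the primary shard is targeted.
    Case 2 (sharded): only shards owning at least one chunk are targeted.
    """
    targeted = set(chunk_owners) if sharded else {primary_shard}
    return sorted(set(shards_with_data) - targeted)


# Case 1: unsharded collection, stray copy created on shard1.
case1 = shards_ignored_by_router(["shard0", "shard1"], sharded=False, primary_shard="shard0")

# Case 2: sharded collection, chunks on shard0 and shard2, stray copy on shard1.
case2 = shards_ignored_by_router(
    ["shard0", "shard1", "shard2"], sharded=True, chunk_owners=["shard0", "shard2"]
)
```

In both cases the data on `shard1` is reachable only through a direct connection to that shard.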

      RESOLUTION

      There is no standard solution, since users need to decide what to do with the data that was inserted via direct connections. The resolution strategy depends heavily on the use case: can the data simply be copied into the sharded collection? What if some documents turn out to be duplicated? Is it acceptable to simply drop the "mismatching" collections and wipe out their data?
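
As an illustration of the "copy and check for duplicates" option, the sketch below splits the documents of a mismatching collection into those that could be copied and those that need a manual decision. It assumes, purely for illustration, that two documents count as duplicates when they share the same `_id`; the function name `plan_copy` and the sample documents are hypothetical.

```python
def plan_copy(stray_docs, existing_ids):
    """Split stray documents into (to_copy, conflicts) based on their _id.

    stray_docs: documents read from the mismatching collection on one shard.
    existing_ids: set of _id values already present in the tracked collection.
    """
    to_copy, conflicts = [], []
    for doc in stray_docs:
        if doc["_id"] in existing_ids:
            conflicts.append(doc)  # needs a user decision: keep, merge, or discard
        else:
            to_copy.append(doc)
    return to_copy, conflicts


# Hypothetical data, for illustration only.
stray = [{"_id": 1, "v": "a"}, {"_id": 2, "v": "b"}, {"_id": 3, "v": "c"}]
to_copy, conflicts = plan_copy(stray, existing_ids={2})
```

Documents in `to_copy` would then be inserted through a router, so that they land on the shard owning the corresponding chunk, while `conflicts` are left for the user to resolve.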

      NOT A RESOLUTION

      Simply changing the collection UUIDs to match the metadata tracked in the sharding catalog would effectively create orphaned documents. Documents hosted on the originally mismatching shards would exist on the wrong shards, with the following consequences:

      • They would not be taken into consideration, due to the routing protocol.
      • They could be wrong or duplicated if a chunk covering their shard key were later moved to that shard.
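
The orphaning effect can be sketched with a toy routing table. This is a simplified model under stated assumptions: integer shard keys, chunks represented as `(min, max, shard)` tuples with an inclusive lower bound and an exclusive upper bound, and hypothetical names (`find_orphans`, `routing_table`).

```python
def find_orphans(shard, shard_keys_on_shard, routing_table):
    """Return the shard key values stored on `shard` that the routing table assigns elsewhere."""

    def owner(key):
        # Chunk ranges are treated as [lower, upper).
        for lower, upper, owning_shard in routing_table:
            if lower <= key < upper:
                return owning_shard
        return None

    return [key for key in shard_keys_on_shard if owner(key) != shard]


# Hypothetical routing table: shard1 owns only the range [100, 200).
routing_table = [(0, 100, "shard0"), (100, 200, "shard1")]

# Documents inserted on shard1 through a direct connection, by shard key.
orphans = find_orphans("shard1", [5, 150, 42], routing_table)
```

After a UUID rewrite, the documents with keys 5 and 42 would sit on `shard1` while routers send queries for those keys to `shard0`. If the chunk `[0, 100)` were later migrated to `shard1`, they would coexist with the migrated documents for the same range.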

            Assignee:
            tommaso.tocci@mongodb.com Tommaso Tocci
            Reporter:
            pierlauro.sciarelli@mongodb.com Pierlauro Sciarelli
            Votes:
            0
            Watchers:
            6
