  1. Core Server
  2. SERVER-15136

Duplicate _ids in production sharded cluster

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels:
    • Linux

      We're detecting duplicate _ids through the balancer logs. Please see the example error below (data-specific details removed because they are sensitive).

      balancer move failed: { cause: { active: false, ns: "...", from: "...", min: { customer_id: ObjectId('...'), sk_customer_shard_group: 50 }, max: { customer_id: ObjectId('...'), sk_customer_shard_group: 50 }, shardKeyPattern: { customer_id: 1.0, sk_customer_shard_group: 1.0 }, state: "fail", errmsg: "cannot migrate chunk, local document { _id: ObjectId('...'), account_class: "...", account_id: ObjectId('...", counts: { cloned: 6189, clonedBytes: 26043128, catchup: 0, steady: 0 }, ok: 1.0 }, ok: 0.0, errmsg: "data transfer error" } from: secondset to: firstset chunk: min: { customer_id: ObjectId('...'), sk_customer_shard_group: 50 } max: { customer_id: ObjectId('...'), sk_customer_shard_group: 50 }
      

      Querying by the _id of the 'local document' returns multiple results, so we have to run a script to de-duplicate that _id. We're using the C# driver and recently updated it to the latest sub-version, which includes an improvement to ObjectId generation, but the conflicting documents tend to be older data that is only picked up as the balancer moves chunks around.
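For illustration only, here is a minimal Python sketch of the kind of duplicate check described above. It is not the reporter's script (which is not shown in this ticket). The helper name `find_duplicate_ids` and the sample documents are hypothetical, and the aggregation pipeline is an assumption about how the check could be run through mongos rather than a confirmed procedure.

```python
from collections import defaultdict

# Hypothetical pipeline that could be passed to a collection's aggregate()
# call to list _id values occurring more than once. Uses only the standard
# $group and $match stages; whether it is practical on a large collection
# is not verified here.
DUPLICATE_ID_PIPELINE = [
    {"$group": {"_id": "$_id", "count": {"$sum": 1}}},
    {"$match": {"count": {"$gt": 1}}},
]


def find_duplicate_ids(docs, id_field="_id"):
    """Group already-fetched documents by id_field and return only the
    groups that contain more than one document."""
    seen = defaultdict(list)
    for doc in docs:
        seen[doc[id_field]].append(doc)
    return {key: group for key, group in seen.items() if len(group) > 1}


if __name__ == "__main__":
    # Made-up sample data; real documents would come from a query.
    sample = [
        {"_id": "a", "sk_customer_shard_group": 50},
        {"_id": "b", "sk_customer_shard_group": 50},
        {"_id": "a", "sk_customer_shard_group": 12},
    ]
    for dup_id, group in find_duplicate_ids(sample).items():
        print(dup_id, len(group))
```

The in-memory helper only sees the documents it is given, so it would need the results of a query that reaches every shard. Deciding which copy of a duplicated document to keep is left to the caller.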

      I'm not sure how to proceed at this point, and I am scratching my head as to why duplicate _ids are present.

            Assignee: Unassigned
            Reporter: Michael Duminy (mjduminy)
            Votes: 0
            Watchers: 3
