Core Server / SERVER-16836

Cluster can create the same unsharded collection on more than one shard

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.6.7, 2.8.0-rc4
    • Component/s: Sharding
    • ALL

      $ mlaunch init --single --sharded 3 --mongos

      run https://gist.github.com/anonymous/8b6783ab067f04e483d6 with 3.0.0-SNAPSHOT Java driver

      This program runs in a loop. On each iteration it inserts 100 documents into a collection and then does a count. If the count is 100, it drops the database and tries again.
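
The loop described above can be sketched in Python. This is a minimal illustration, not the code from the gist: the helper names `run_iteration` and `reproduce`, the database and collection names, and the iteration cap are all hypothetical. It assumes a PyMongo-style client (`insert_one`, `count_documents`, `drop_database`).

```python
N_DOCS = 100  # documents inserted per iteration, as in the report


def run_iteration(collection, n_docs=N_DOCS):
    """Insert n_docs documents one at a time, then return the collection count."""
    for i in range(n_docs):
        collection.insert_one({"i": i})
    return collection.count_documents({})


def reproduce(client, db_name="test", coll_name="repro", max_iterations=1000):
    """Repeat insert/count/drop until a count differs from N_DOCS.

    Returns (iteration, count) for the first mismatch, or None if every
    iteration counted N_DOCS documents. Names and the iteration cap are
    illustrative assumptions, not taken from the original program.
    """
    for iteration in range(max_iterations):
        count = run_iteration(client[db_name][coll_name])
        if count != N_DOCS:
            return iteration, count
        # The count was correct, so drop the database and try again.
        client.drop_database(db_name)
    return None
```

With PyMongo installed, one way to call it is `reproduce(MongoClient("mongodb://localhost:27017,localhost:27018"))`. The two ports are an assumption about where the mongos routers listen. Whether a given driver spreads operations across both routers the way the 3.0 Java driver does is also an assumption.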

      Expected results:

      The count is 100 on every iteration

      Actual results:

      The count usually hovers around 50.

      Analysis:

      The 3.0 Java driver load-balances operations across all mongos servers listed in the connection string, so the 100 inserts are distributed roughly evenly across the two mongos servers. The mongos instances should agree on which shard is the primary shard for the database, but sometimes they do not: the shard primary logs show some of the inserts going to one shard and some to another. The value returned by the subsequent count command depends on which mongos receives it, because each mongos returns only the count of documents on the shard it believes is the home of that collection.
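
The failure mode in this analysis can be illustrated with a small in-memory model. It is a sketch under stated assumptions, not MongoDB code: `Shard`, `Router`, and `simulate` are hypothetical names, the driver is modelled as strict alternation between two routers (a real driver only balances roughly), and the disagreement about the primary shard is set up by hand rather than produced by a race.

```python
from itertools import cycle


class Shard:
    """Holds documents per namespace, like a single shard's storage."""

    def __init__(self, name):
        self.name = name
        self.collections = {}  # namespace -> list of documents


class Router:
    """A mongos stand-in that caches the primary shard for each database."""

    def __init__(self, name, shards):
        self.name = name
        self.shards = shards      # shard name -> Shard
        self.primary_cache = {}   # database name -> shard name

    def _home(self, namespace):
        db_name = namespace.split(".", 1)[0]
        return self.shards[self.primary_cache[db_name]]

    def insert(self, namespace, doc):
        self._home(namespace).collections.setdefault(namespace, []).append(doc)

    def count(self, namespace):
        # Only the shard this router believes is the home is consulted.
        return len(self._home(namespace).collections.get(namespace, []))


def simulate(n_docs=100):
    shards = {name: Shard(name) for name in ("shard01", "shard02", "shard03")}
    routers = [Router("mongos-a", shards), Router("mongos-b", shards)]
    # The modelled bug: the routers disagree about the primary shard for "test".
    routers[0].primary_cache["test"] = "shard01"
    routers[1].primary_cache["test"] = "shard02"

    # Driver stand-in: alternate operations between the two routers.
    pick = cycle(routers)
    for i in range(n_docs):
        next(pick).insert("test.repro", {"i": i})

    counts = {r.name: r.count("test.repro") for r in routers}
    per_shard = {s.name: len(s.collections.get("test.repro", []))
                 for s in shards.values()}
    return counts, per_shard
```

In this model each router reports 50 of the 100 documents, and the documents sit on two different shards, which matches the "hovers around 50" observation when the split is roughly even.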

      I also observed that flushing the router config on the mongos that disagrees with the config server fixes the problem, though it orphans the documents that it had inserted into the wrong shard.
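
The workaround and its side effect can be sketched as two helpers. `flushRouterConfig` is the admin command that makes a mongos discard its cached routing metadata. Everything else here is an assumption for illustration: the helper names, PyMongo-style clients, and having one client connected directly to each shard so that stray documents can be counted without going through a mongos.

```python
def flush_router_config(mongos_client):
    """Ask one mongos to discard its cached routing metadata.

    mongos_client is assumed to be a PyMongo-style client connected
    directly to the mongos that disagrees with the config server.
    """
    return mongos_client.admin.command("flushRouterConfig")


def stray_document_counts(shard_clients, db_name, coll_name, primary_shard):
    """Count documents left on shards other than the database's primary.

    shard_clients maps a shard name to a client connected directly to
    that shard (hypothetical setup). For an unsharded collection, any
    non-zero count on a non-primary shard is data the routers no longer see.
    """
    strays = {}
    for shard_name, client in shard_clients.items():
        if shard_name == primary_shard:
            continue
        count = client[db_name][coll_name].count_documents({})
        if count:
            strays[shard_name] = count
    return strays
```

Flushing only changes where the router sends later operations. Documents it already wrote to the wrong shard stay there, which is why a check like `stray_document_counts` can still report them afterwards.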

      A sharded cluster can create the same unsharded collection on more than one shard, with each mongos thinking that the collection lives on a different shard.

            Assignee:
            sheeri.cabral Sheeri Cabral (Inactive)
            Reporter:
            jeff.yemin@mongodb.com Jeffrey Yemin
            Votes:
            0
            Watchers:
            18
