[SERVER-16836] Cluster can create the same unsharded collection on more than one shard Created: 14/Jan/15  Updated: 24/Jun/19  Resolved: 24/Jun/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.7, 2.8.0-rc4
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jeffrey Yemin
Assignee: Sheeri Cabral (Inactive)
Resolution: Duplicate
Votes: 0
Labels: ShardingRoughEdges
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-17397 Dropping a Database or Collection in ... Closed
Operating System: ALL
Steps To Reproduce:

$ mlaunch init --single --sharded 3 --mongos

Run https://gist.github.com/anonymous/8b6783ab067f04e483d6 with the 3.0.0-SNAPSHOT Java driver.

This program runs in a loop. On each iteration it inserts 100 documents into a collection and then does a count. If the count is 100, it drops the database and tries again.

Expected results:

The count is 100 on every iteration.

Actual results:

The count usually hovers around 50.

Analysis:

The 3.0 Java driver load balances operations across all mongos servers provided in the connection string, so the 100 inserts are distributed roughly evenly across the two mongos servers. The mongos instances should agree on which shard is the primary shard for the database, but sometimes they don't: the shard primary logs show some of the inserts going to one shard and some to another. The value returned from the subsequent count command then depends on which mongos it is sent to, because that mongos only returns the count of documents from the shard it thinks is the home of that collection.

I also observed that flushing the router config on the mongos that disagrees with the config server fixes the problem, though it orphans the documents that it had inserted into the wrong shard.
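
The analysis above can be illustrated with a toy model. This is only a sketch in plain Python, not MongoDB code: the `Router` class, the shard and mongos names, and the strict round-robin distribution are all assumptions made for illustration (the real driver only distributes operations roughly evenly). It shows two routers with different cached primary shards splitting 100 inserts, each reporting a count of 50, and a flush of the stale router's cache restoring agreement while leaving 50 documents orphaned.

```python
import itertools


class Router:
    """Hypothetical stand-in for a mongos that caches the database's primary shard."""

    def __init__(self, name, cached_primary):
        self.name = name
        self.cached_primary = cached_primary

    def insert(self, shards, doc):
        # Unsharded collection: the write goes to whatever shard this router
        # currently believes is the primary for the database.
        shards[self.cached_primary].append(doc)

    def count(self, shards):
        # Only the shard this router believes is the collection's home is consulted.
        return len(shards[self.cached_primary])

    def flush_router_config(self, config_primary):
        # Models reloading routing metadata from the config server.
        self.cached_primary = config_primary


def run_iteration(routers, shards, n_docs=100):
    """Insert n_docs, alternating routers, then ask each router for a count."""
    round_robin = itertools.cycle(routers)  # assumed stand-in for driver load balancing
    for i in range(n_docs):
        next(round_robin).insert(shards, {"_id": i})
    return {r.name: r.count(shards) for r in routers}


# The config server says shard01 is primary; mongos-b holds a stale view.
config_primary = "shard01"
shards = {"shard01": [], "shard02": [], "shard03": []}
routers = [Router("mongos-a", "shard01"), Router("mongos-b", "shard02")]

counts = run_iteration(routers, shards)
print(counts)  # {'mongos-a': 50, 'mongos-b': 50}

# Flushing the stale router makes both routers agree again...
routers[1].flush_router_config(config_primary)
counts_after_flush = {r.name: r.count(shards) for r in routers}
print(counts_after_flush)  # {'mongos-a': 50, 'mongos-b': 50}

# ...but the documents written to the wrong shard are no longer reachable.
orphaned = len(shards["shard02"])
print(orphaned)  # 50
```

In this model neither router ever sees all 100 documents, which matches the observed count of about 50, and the flush repairs routing without recovering the documents already written to the other shard.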

 Description   

A sharded cluster can create the same unsharded collection on more than one shard, with each mongos thinking that the collection lives on a different shard.



 Comments   
Comment by Kaloian Manassiev [ 24/Jun/19 ]

oleg.pudeyev, I just read the Java program linked in the description and I see that it drops the database and then re-creates it. It is a known problem that upon change of the database primary, different router nodes will not be made aware of the change and this is tracked under SERVER-17397.

Generated at Thu Feb 08 03:42:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.