Type: Bug
Resolution: Works as Designed
Priority: Critical - P2
Affects Version/s: 3.6.6
Component/s: None
Operating System: ALL
Hi,
I'm relatively new to Mongo and seem to have stumbled upon an apparent race condition / bug in a sharded Mongo deployment.
Please advise if I'm missing something fundamental.
Problem Summary:
Very rarely, though reproducibly, duplicate documents can be inserted into a hashed sharded collection, even when a unique index exists that should explicitly prevent this from occurring.
Please see the details below.
Test Environment:
The race condition was originally reproduced on a larger deployment (3-member replica sets), but the environment was simplified as follows for reproduction:
General environment:
- Mongo Community Edition v3.6.6
- CentOS Linux release 7.4.170
- Everything running on localhost
Minimum sharded cluster:
- 1 config-server (replica set with only 1 member)
- 3 shards (replica set with only 1 member per shard) using WiredTiger storage; see the sh.addShard() sketch after this list
- 2 mongoses
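For reference, the shards were registered with the cluster along these lines (a sketch rather than the exact commands used; the hosts match the sh.status() output further below):
mongos> sh.addShard("shard1/localhost:50001")
mongos> sh.addShard("shard2/localhost:50002")
mongos> sh.addShard("shard3/localhost:50003")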
Application:
- Test application that uses the Mongo C driver.
- Each instance connects to both mongoses and issues insert & delete commands as described later.
Test Purpose:
In a sharded collection using hashed sharding, verify that duplicate documents (those with the same shard key value) cannot be inserted but are instead rejected.
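For reference, the test database and collection were set up roughly as follows (a sketch rather than the exact commands used):
mongos> sh.enableSharding("dbClientTest")
mongos> use dbClientTest
mongos> sh.shardCollection("dbClientTest.repo1", { "_id" : "hashed" } )
For the "repo1pk" variant in Good Example #2 below, the hashed shard key and an explicit unique index were instead created along the lines of:
mongos> sh.shardCollection("dbClientTest.repo1", { "repo1pk" : "hashed" } )
mongos> db.repo1.createIndex( { "repo1pk" : 1 }, { unique : true } )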
Two good examples are demonstrated below in which a "duplicate key error" is returned because the shard key value already exists in the collection.
Good Example #1: using _id as the hashed shard key
mongos> db.repo1.insert( { "_id" : "a9223000000000000001", "fn1" : "1" } )
WriteResult({ "nInserted" : 1 })
mongos> db.repo1.insert( { "_id" : "a9223000000000000001", "fn1" : "1" } )
WriteResult({
	"nInserted" : 0,
	"writeError" : {
		"code" : 11000,
		"errmsg" : "E11000 duplicate key error collection: dbClientTest.repo1 index: _id_ dup key: { : \"a9223000000000000001\" }"
	}
})
Good Example #2: using "repo1pk" field-name as the hashed shard key (with a unique index also defined)
mongos> db.repo1.insert( { "repo1pk" : "a9223000000000000001", "fn1" : "1" } )
WriteResult({ "nInserted" : 1 })
mongos> db.repo1.insert( { "repo1pk" : "a9223000000000000001", "fn1" : "1" } )
WriteResult({
	"nInserted" : 0,
	"writeError" : {
		"code" : 11000,
		"errmsg" : "E11000 duplicate key error collection: dbClientTest.repo1 index: repo1pk_1 dup key: { : \"a9223000000000000001\" }"
	}
})
Example #2 is really how my application behaves; I reproduced the problem below using _id just to see if it would behave differently (it doesn't).
Race Condition / Bug:
I wrote a test application that attempts to insert & delete a single document - repeat forever - where "_id" is my hashed shard key:
{ "_id" : "a9223000000000000001", "fn1" : "1", "fn2" : 2 }
Four instances of my test driver are started and compete trying to create & delete the single document above.
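The test instances are written with the Mongo C driver; a rough mongo-shell equivalent of each instance's loop (a sketch, not the actual test code) is:
while (true) {
    try {
        // Try to create the single test document; a duplicate key error here is
        // expected whenever another instance has already created it.
        db.repo1.insertOne( { "_id" : "a9223000000000000001", "fn1" : "1", "fn2" : 2 } );
    } catch (e) {
        // ignore the E11000 duplicate key error
    }
    // Delete every document matching { "fn1" : "1" }; more than one should never exist.
    var res = db.repo1.deleteMany( { "fn1" : "1" } );
    if (res.deletedCount > 1) {
        print("Unexpected: " + res.deletedCount + " documents deleted; expecting 1");
    }
}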
The sharded collection should contain at most one document at any given time, and this is generally the case. However, very rarely a duplicate document is inserted; my tool detects this during document deletion, when Mongo responds with { "deletedCount" : 2 } because the delete removes all documents matching { "fn1" : "1" }.
When my tool detects that more than one document was deleted it spits out:
[TID 0x7f3d07fdf700] Unexpected: 2 documents deleted; expecting 1
For example, during a 5-minute test run there were 270,000 create (insert) attempts, of which 134,514 succeeded because the document didn't yet exist; 141 duplicate documents were detected during the delete cycle:
[TID 0x7fe08f7fe700] Create Sent=65000 OK=31914; Delete Sent=65000 OK=32214 >1Total=33
[TID 0x7f68daffd700] Create Sent=65000 OK=32312; Delete Sent=65000 OK=32308 >1Total=28
[TID 0x7fbe30fd9700] Create Sent=70000 OK=35301; Delete Sent=70000 OK=34969 >1Total=44
[TID 0x7fbb6904b700] Create Sent=70000 OK=34987; Delete Sent=70000 OK=34859 >1Total=36
To confirm that duplicate documents were actually inserted, I wrote a reader that constantly queries on the "fn1" field; it can sometimes catch instances of duplicate documents, as shown below:
{ "totalRecordsRetrieved" : 2, "records" : [ { "_id" : "a9223000000000000001", "fn1" : "1", "fn2" : 2 }, { "_id" : "a9223000000000000001", "fn1" : "1", "fn2" : 2 } ]}
The above behavior seems unexpected, and a race condition, in light of the following:
The Mongo documentation notes that this behavior shouldn't be possible because of the 'unique' index on _id (or, in my application's case, the unique index I explicitly create when using "repo1pk" as the hashed shard key):
By default, MongoDB creates a unique index on the _id field during the creation of a collection. The unique constraint applies to separate documents in the collection. That is, the unique index prevents separate documents from having the same value for the indexed key. https://docs.mongodb.com/manual/core/index-unique/
Kevin Adistambha with Mongo also recently wrote that, when inserting or updating a document, "The index and the collection will be updated as a single atomic operation." So it would seem that an attempt to insert a duplicate document should be caught and rejected due to the presence of a unique index?
Other data:
This is the dummy document that is created and deleted, over and over, forever.
mongos> db.repo1.find()
{ "_id" : "a9223000000000000001", "fn1" : "1", "fn2" : 2 }
The set of indexes defined for my test is shown below. I doubt this matters, but I've reproduced the problem both with and without the index on "fn1".
mongos> db.repo1.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "dbClientTest.repo1"
	},
	{
		"v" : 2,
		"key" : {
			"_id" : "hashed"
		},
		"name" : "_id_hashed",
		"ns" : "dbClientTest.repo1"
	},
	{
		"v" : 2,
		"key" : {
			"fn1" : 1
		},
		"name" : "fn1_1",
		"ns" : "dbClientTest.repo1"
	}
]
Here is the shard status when reproducing the problem. Curious: why is "unique: false" stated below, even though a unique index is created by default (for _id) and is otherwise (usually) enforced?
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
  	"_id" : 1,
  	"minCompatibleVersion" : 5,
  	"currentVersion" : 6,
  	"clusterId" : ObjectId("5b7ad9a050c13af8f291f1f9")
  }
  shards:
        {  "_id" : "shard1",  "host" : "shard1/localhost:50001",  "state" : 1 }
        {  "_id" : "shard2",  "host" : "shard2/localhost:50002",  "state" : 1 }
        {  "_id" : "shard3",  "host" : "shard3/localhost:50003",  "state" : 1 }
  active mongoses:
        "3.6.6" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration Results for the last 24 hours:
                8 : Success
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1	1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 0)
        {  "_id" : "dbClientTest",  "primary" : "shard3",  "partitioned" : true }
                dbClientTest.repo1
                        shard key: { "_id" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                shard1	2
                                shard2	2
                                shard3	2
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard1 Timestamp(3, 2)
                        { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard1 Timestamp(3, 3)
                        { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard2 Timestamp(3, 4)
                        { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard2 Timestamp(3, 5)
                        { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard3 Timestamp(3, 6)
                        { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard3 Timestamp(3, 7)