[SERVER-28153] Duplicate data is sometimes inserted with unique index Created: 01/Mar/17  Updated: 18/Apr/17  Resolved: 29/Mar/17

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 3.2.7, 3.2.12
Fix Version/s: None

Type: Bug Priority: Blocker - P1
Reporter: Phillip Wirth Assignee: Eric Milkie
Resolution: Duplicate Votes: 1
Labels: uic
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-28546 Documents can erroneously be unindexe... Closed
Operating System: ALL
Participants:

 Description   

I have a unique index with a partialFilterExpression on a collection, but duplicate data is sometimes inserted.

Index creation

getCollection().createIndex(new BasicDBObject(userId, 1)
        , new BasicDBObject("name", "uidx-something-user")
                .append("partialFilterExpression", new BasicDBObject(Properties.someting, new BasicDBObject("$eq", true)))
                .append("unique", true));
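
For reference, the Java builder above should correspond to the following shell-style index specification (a sketch; this assumes the Java constants resolve to the field names "userId" and "something", consistent with the getIndexes output below):

```javascript
// Sketch of the equivalent shell-style index specification. Assumption:
// the Java constants resolve to the field names "userId" and "something".
const keys = { userId: 1 };
const options = {
  name: "uidx-something-user",
  unique: true,
  partialFilterExpression: { something: { $eq: true } },
};
// In the mongo shell this would be: db.something.createIndex(keys, options)
```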

The index from the getIndexes command:

{
        "v" : 1,
        "unique" : true,
        "key" : {
                "userId" : 1
        },
        "name" : "uidx-something-user",
        "ns" : "somewhere.something",
        "partialFilterExpression" : {
                "something" : {
                        "$eq" : true
                }
        }
}

The duplicated documents:

{
        "_id" : "08a8506c-bcbc-4ed6-9972-67fd7c37b4bc",
        "userId" : "1068",
        "express" : false,
        "something" : true,
        "items" : [ ],
        "recipient" : {
                "_id" : "efbd8618-c480-4194-964e-f5a821edf695"
        }
}
{
        "_id" : "b6695c6a-f29d-4531-96ac-795f14c72547",
        "userId" : "1068",
        "express" : false,
        "something" : true,
        "items" : [ ],
        "recipient" : {
                "_id" : "4f93fe38-edb2-4cb7-a1b3-c2c51ac8ded1"
        }
}

MongoDB version: 3.2.7; it seems to happen with 3.2.12 as well.

A side note: when dumping the collection and restoring it, a duplicate key error is thrown.

Why is it sometimes possible to insert duplicate data, and how can that be avoided?



 Comments   
Comment by Eric Milkie [ 29/Mar/17 ]

I discovered a problem with partial index filters on unique indexes with WiredTiger that I believe is what you were experiencing here. Please follow the linked ticket for further updates.

Comment by Eric Milkie [ 27/Mar/17 ]

Hi pwirth
Can you confirm the issue is no longer happening for you? You mentioned previously that the issue might be happening with a different index; did you drop that index now?
I would be interested to know if you recreated the indexes with unique constraints but without a partial index filter, in order to narrow down exactly what features may be causing the issue.

Comment by Phillip Wirth [ 22/Mar/17 ]

Hi, this issue is not present anymore since we no longer use unique indexes with a partialFilterExpression.

Comment by Ramon Fernandez Marina [ 21/Mar/17 ]

pwirth, if this is still an issue for you, can you please provide the information that Thomas requested in the previous comment?

Thanks,
Ramón.

Comment by Kelsey Schubert [ 13/Mar/17 ]

Hi pwirth,

Would you please provide the complete output of the following commands against the affected collections?

  • db.collection.getIndexes()
  • db.collection.stats({indexDetails: true})

Thank you,
Thomas

Comment by Phillip Wirth [ 13/Mar/17 ]

The problem still exists, now with a different index...

Of 38 documents, only 28 were in the index. Calling reIndex on the collection resolved the issue; however, I guess that further duplicate documents will appear.

I just found out that there isn't any _id index, which is very weird!

Comment by Phillip Wirth [ 13/Mar/17 ]

Hi Eric, unfortunately this isn't possible anymore: we changed some fields and dropped the index, since this was all happening live.

And I can't provide any server logs, since we do not have permission to read that file...

Comment by Eric Milkie [ 10/Mar/17 ]

Since this index is clearly not working, would you like to try dropping and recreating it with the same spec? It would be interesting to see if the index rebuild would discover the unique key violations, or not. This would help narrow down where the issue lies.
If the index doesn't rebuild but gets an error due to the duplicates, I would presume this would be okay since you are planning on changing this collection and indexes today anyway.

Comment by Eric Milkie [ 10/Mar/17 ]

Thanks for that info. I have been using upserts with string _id fields in my repro attempts, so it's still not clear if that particular code path is involved in this issue.
Can you provide the beginning of the server log, which contains the build info? I'd like to confirm I am using the exact server version that you are.

Comment by Phillip Wirth [ 10/Mar/17 ]

Hey Eric,

I just found out that our code does not perform a simple insert.

        try {
            this.collection.save(dbObject);
        } // exception handling

The documentation on save

Update an existing document or insert a document depending on the parameter. If the document does not contain an '_id' field, then the method performs an insert with the specified fields in the document as well as an '_id' field with a unique objectId value. If the document contains an '_id' field, then the method performs an upsert querying the collection on the '_id' field:
If a document does not exist with the specified '_id' value, the method performs an insert with the specified fields in the document.
If a document exists with the specified '_id' value, the method performs an update, replacing all fields in the existing record with the fields from the document.

Since we generate a string _id field, an upsert will be performed.
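
The dispatch described in that documentation can be sketched as follows (a simplified model of the decision, not the driver's actual implementation):

```javascript
// Simplified model of the legacy driver's save() dispatch (not the real
// implementation): no _id means a plain insert; a present _id means a
// replace-by-_id with upsert semantics.
function saveBehavior(doc) {
  if (!("_id" in doc)) {
    return "insert"; // driver generates an ObjectId _id and inserts
  }
  return "upsert";   // query on _id; insert if absent, replace if present
}
```

Because the application always supplies a string UUID _id, every save() here takes the upsert path.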

Comment by Phillip Wirth [ 08/Mar/17 ]

Have you seen evidence of this?

https://jira.mongodb.org/browse/SERVER-28153?focusedCommentId=1518375&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1518375 I noted it here after I looked closely at the documents which are not in the index (anymore) and found around 80 documents that match this pattern.

In version 3.4, we enhanced the validate command to detect these types of problems with partial indexes

sounds great!

Another option to explore: To determine if WiredTiger is the culprit here

Well, we can't even get the mongod logs from this server... we will see after Friday. I have to leave for today; good night to wherever you are.

Comment by Eric Milkie [ 08/Mar/17 ]

It is too late for such a check now, but I do not think that such a thing would be helpful since there are documents updated and then disappeared, sometimes after days, from the index somehow. Which implies that they were in the index once they got inserted.

Have you seen evidence of this? (I didn't see anything in the discussion so far in this ticket that showed that.) This fact would narrow down the areas of the code to look at.
In version 3.4, we enhanced the validate command to detect these types of problems with partial indexes, so for your case I could imagine periodically running validate to more closely determine when the records go missing from the index.

Another option to explore: To determine if WiredTiger is the culprit here, we would require you to send us your journal files immediately after an inserted or updated record did not get indexed; in addition, a special WiredTiger option to not discard journal files after a checkpoint, in case a checkpoint happens to run. If you are willing to explore this option, I can send more details.

Comment by Phillip Wirth [ 08/Mar/17 ]

I'd like you to try dropping the index and recreating it the same way

It will get dropped; however, a new index with slightly different behavior replaces this one. So we won't be able to reproduce this issue anymore after next Friday.

I'd also like to determine immediately when newly inserted or updated documents are not indexed that ought to be. Since we haven't been able to narrow down which layer of the database has the bug, I think the best layer in which to put a check is the application layer.

It is too late for such a check now, but I do not think that such a thing would be helpful since there are documents updated and then disappeared, sometimes after days, from the index somehow. Which implies that they were in the index once they got inserted.

Some other actions that may result in a change in behavior would be to upgrade

I already suggested to my colleagues updating at least to 3.2.12... we will see what Friday brings for our next release.

Thanks also for the help so far. Maybe I will find the time on Friday to write a batch script to trigger this faulty behavior somehow.

Comment by Eric Milkie [ 08/Mar/17 ]

I've discussed this situation further with my colleagues. I'd like you to try dropping the index and recreating it the same way, to see if it still has the same issue. You will need to manually clear out the documents with conflicting "userId" fields first.
I'd also like to determine immediately when newly inserted or updated documents are not indexed that ought to be. Since we haven't been able to narrow down which layer of the database has the bug, I think the best layer in which to put a check is the application layer. Would you be willing to add some application code that confirms, after a record is successfully inserted with {something:true}, that it exists in the uidx-something-user index? As we take steps to narrow down the problem, I'd like to have immediate signal on whether a given step has caused a change in behavior, and confirming that inserts are working properly or not is the best way to achieve that.
Some other actions that may result in a change in behavior would be to upgrade to 3.2.12, or to 3.4.2, although those actions wouldn't necessarily definitively expose the bug. Do you have any upgrades already planned for this database?
Thanks for your understanding and assistance while we diagnose this issue!
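
A minimal sketch of such an application-layer check. Here `countInIndex` is a hypothetical caller-supplied function standing in for whatever driver call queries the document's _id with .hint("uidx-something-user") and returns the match count:

```javascript
// Sketch of the proposed application-layer verification. `countInIndex`
// is a hypothetical function that queries by _id with
// .hint("uidx-something-user") and returns the number of matches.
function verifyIndexed(doc, countInIndex) {
  if (doc.something !== true) {
    return true;                      // not covered by the partial index
  }
  return countInIndex(doc._id) === 1; // must appear exactly once in the index
}
```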

Comment by Phillip Wirth [ 08/Mar/17 ]

On another collection with 3 partialFilterExpression indices, where each filter consists of 2 $exists operations combined, everything works fine.

So everything points to a faulty check with new BasicDBObject(Properties.someting, new BasicDBObject("$eq", true)). What do you think, Eric?

Comment by Phillip Wirth [ 07/Mar/17 ]

When documents get updated, which fields typically change?

Usually items and recipient. Also, some fields can get added to the root document, but there is no visible pattern...

Would such updates move a document into or out of one of the partial indexes on the collection?

That is not possible by application logic.

do you have any other indexes on this collection, or just the three you described so far?

There are no other indices.

I can now confirm there are documents which got updated but are not in the index...

Comment by Eric Milkie [ 07/Mar/17 ]

Interesting. When documents get updated, which fields typically change? Would such updates move a document into or out of one of the partial indexes on the collection?
I also meant to ask: do you have any other indexes on this collection, or just the three you described so far?

Comment by Phillip Wirth [ 07/Mar/17 ]

1.

It would be an interesting data point if you found documents with the same userId and more than 0 of them were in the uidx-something-user index

Such a document does not exist.

2.

Also interesting would be the presence of a document with a unique userId and not present in the index

371 of 786 unique userIds were not found in the index.

It gets weirder now.

3.

Also, is your application code path different when it inserts a document with {something:true}

No, the insert is always the same. In the case of something:false, some more fields are populated that are irrelevant to the uidx-something-user index but relevant for idx-another-index, since one of these additional fields forms the unique key.

I found 112 documents which got updated after they got inserted and which are not in the index o.O (Edit: the query I used was broken.)

Comment by Eric Milkie [ 07/Mar/17 ]

Also, is your application code path different when it inserts a document with {something:true} versus a document with {something:false}? Would such writes possibly have different parameters to the update command, via the Java driver?

Comment by Eric Milkie [ 07/Mar/17 ]

It would be an interesting data point if you found documents with the same userId and more than 0 of them were in the uidx-something-user index. That might imply that the documents now missing from the index were at one point present in the index, and sometime later those records were either explicitly removed via a bug in the insert logic, or lost by the storage engine.
Also interesting would be the presence of a document with a unique userId and not present in the index. That would suggest that unique constraint violation is not involved in this bug, and is just a symptom rather than a trigger.

Comment by Phillip Wirth [ 07/Mar/17 ]

I now have a test collection with all the documents matching "something:true"

db.collection.find({something:true}).forEach(function(doc) {
 db.collection_all_something.insert(doc);
 });

and a collection with all the documents matching "something:true" from the index.

db.collection.find({something:true}).hint("uidx-something-user").forEach(function(doc) {
 db.collection_from_index.insert(doc);
 });
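
Given the two result sets above (exported as plain arrays, e.g. via toArray()), the documents missing from the index can be computed with a simple set difference. A sketch:

```javascript
// Returns the documents present in the full {something: true} scan but
// absent from the hinted (index-driven) scan, i.e. documents the partial
// unique index has lost. Documents are matched by their _id.
function missingFromIndex(allDocs, indexedDocs) {
  const indexedIds = new Set(indexedDocs.map(d => d._id));
  return allDocs.filter(d => !indexedIds.has(d._id));
}
```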

The first thing I did was to find out if this only affects some specific userIds... but no.

db.collection_all_something.aggregate([ {"$group" : {_id:"$userId", count:{$sum:1}}}, {$sort:{"count":-1}} ])

There is one user having 6 documents, some having 5 or 4, but most having 3 or 2 documents.

I looked a bit further...
It doesn't seem to have to do with the userId field at all.

db.collection_all_something.find({"userId" : "37"})

Returns 6 documents.

When looking in the collection derived from the index:

db.collection_from_index.find({"userId" : "37"})

Nothing is found. This pattern holds for most of the documents... I have only found a few exceptions so far, and even they don't seem to be anything special.

The time window is now from mid-January (or before; I do not have all the logs...) till now.

Comment by Eric Milkie [ 07/Mar/17 ]

Sorry, I forgot you are using a user-defined _id field. Can you check that the _id index has all the records (~4333 or whatever the current record count is now)?
This will confirm that the records that didn't get inserted into uidx-something-user still got inserted into the _id index.

Comment by Phillip Wirth [ 07/Mar/17 ]

I expect that "idx-another-index" is a partial index filtered on "something:false"?

- Correct.

There is a difference between those 2 indices: idx-another-index also covers more fields which, based on the business logic, should be unique.

Just to let you see the other index, sorry for this obfuscation...

> db.collection.getIndices()
[
        {
                "v" : 1,
                "unique" : true,
                "key" : {
                        "recipient.customerNumber" : 1,
                        "userId" : 1
                },
                "name" : "idx-another-index",
                "ns" : "dbname.collection",
                "partialFilterExpression" : {
                        "something" : {
                                "$eq" : false
                        }
                }
        }
...

Since the collection is not that big I will try my best.

The time window can't be the reason as I said previously

in my researches I found out that there were days between the creation date of these 2 documents

Unfortunately there is no ObjectId, just a self-written _id field which is a random UUID, so there is no chance to get a timestamp from it, and no other field to determine when the document was inserted. I can only grep through the logs, which are incomplete...

Comment by Eric Milkie [ 07/Mar/17 ]

Those counts are very strange. 3/4 of the records that should be in the uidx-something-user index are erroneously not indexed.
I expect that "idx-another-index" is a partial index filtered on "something:false"?
It does seem as though all of the missing index records are from the uidx-something-user index; the idx-another-index appears to have no missing records.
Do you think you can try to determine if all the records that are incorrectly unindexed have something in common? I discussed with my colleagues and there isn't a quick way to determine exactly which records are unindexed; you would have to query all the records with {something:true}, then query all the records in the uidx-something-user index via a hint, and then compare the two query responses. Perhaps all the incorrectly unindexed records shared the same insert time window, or have the same configuration of fields?
We can parse the ObjectId's for the insert times of the records to help determine this.
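
For collections that do use ObjectId _ids, the insert time can be recovered from the id string itself, since the first four bytes encode the creation time as seconds since the Unix epoch (a sketch; it does not apply to this collection's UUID string _ids):

```javascript
// Sketch: recover an ObjectId's creation time from its hex string.
// The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
function objectIdTimestamp(hexId) {
  return new Date(parseInt(hexId.substring(0, 8), 16) * 1000);
}
```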

Comment by Phillip Wirth [ 07/Mar/17 ]

Seems legit.

Current count: 4333
idx-another-index: 3113
uidx-something-user: 486

db.something.find({something:true}).itcount()
1222

Comment by Eric Milkie [ 07/Mar/17 ]

Can you also count the number of documents with {something:true}? We can compare that number with 485 to learn how many records are not in the index but should be.

db.something.find({something:true}).itcount()

Comment by Phillip Wirth [ 07/Mar/17 ]

Hi Eric,

the collection is very small, just 4331 entries currently.

> db.collection.validate(true);
{
        "ns" : "dbname.collection",
        "nrecords" : 4331,
        "nIndexes" : 2,
        "keysPerIndex" : {
                "dbname.collection.$idx-another-index" : 3112,
                "dbname.collection.$uidx-something-user" : 485 <-------------------- the broken one
        },
        "indexDetails" : {
                "dbname.collection.$idx-another-index" : {
                        "valid" : true
                },
                "dbname.collection.$uidx-something-user" : {
                        "valid" : true
                }
        },
        "valid" : true,
        "errors" : [ ],
        "ok" : 1
}

An interesting fact:

Both indices together should cover all documents, but there is a huge gap:

4331 - 3112 = 1219 != 485

The other index is a partial index too, but those problems only occur with the one previously mentioned, uidx-something-user.

And could you please focus on version 3.2.7? I am pretty sure this bug exists in 3.2.12 too, but I can't verify that :/

Comment by Eric Milkie [ 07/Mar/17 ]

Hi Phillip,
If possible, could you try running the validate command with the "full" option? If you have a lot of data, it will lock your server while it runs, so this may not be viable for you to run on a production server. But it should definitively identify all unique key violations in the collection you run it on.

> db.collection.validate(true);

Comment by Eric Milkie [ 07/Mar/17 ]

I took the information you provided and have been attempting to write reproduction scripts to try to suss this out. In the past few days I have examined the code in-depth but haven't found any smoking guns yet.

Comment by Phillip Wirth [ 03/Mar/17 ]

1.
This is the only way documents get inserted:

        BasicDBObject dbObject = (BasicDBObject)JSON.parse(documentString);
 
        try {
            this.collection.save(dbObject);
        } // exception handling

With "the specific fields" I mean userId and something.

3.

db.collection.find({"userId" : "1068"}).pretty()


Will find the documents.

db.collection.find({"_id" : {"$in" : ["b6695c6a-f29d-4531-96ac-795f14c72547", "08a8506c-bcbc-4ed6-9972-67fd7c37b4bc"]}}).pretty()

Will find the documents.

db.collection.find({userId : "1068"}).hint("uidx-something-user")

Will not find the documents

db.collection.find({"_id" : {"$in" : ["b6695c6a-f29d-4531-96ac-795f14c72547", "08a8506c-bcbc-4ed6-9972-67fd7c37b4bc"]}}).hint("uidx-something-user")

Will not find the documents

Side note: in my research I found out that there were days between the creation dates of these 2 documents (3, or up to a week, not sure anymore), but the index was created way before.

Since this is live data, there will be an update next Friday which will delete the index and create another one with different options. Maybe this will fix the problem (but I bet not, since it is also a unique partialFilterExpression index). In other words: the window to research this bug is closing. The only thing that I will have afterwards is a dump with a broken index (which I can't restore with indices, as previously mentioned).

Comment by Eric Milkie [ 03/Mar/17 ]

Thank you for your answers so far! I need some clarification.

1. What do you mean by "specific fields"? Which exact ones? If possible, can you supply the exact invocation for writes that resulted in the conflicting documents?

3. Are you certain none of the documents are in the index? This fact is surprising to me. I would expect either one or both of a conflicting pair of documents to be in the index. Can you show the exact queries and output that show this?

Comment by Phillip Wirth [ 03/Mar/17 ]

Hi @milkie

1. Just normal inserts cause the problem; the specific fields are never updated again, but other parts of the document are, with simple updates.

2. It is a standalone node.

3. The documents in question are not in the index.

4. No document validation is used, db.getCollectionInfos() prints only collection names with empty options on every collection.
5.

> db.serverStatus().storageEngine
{
	"name" : "wiredTiger",
	"supportsCommittedReads" : true,
	"persistent" : true
}

6. Maybe once a week; not quite sure, since the error causes subsequent errors when the data gets reimported on another server.

Comment by Eric Milkie [ 02/Mar/17 ]

Hi pwirth
In order to diagnose this problem, I'm hoping you can help supply some diagnostic data:
1. How are the conflicting documents written? Are they inserted and then subsequently updated? Are the updates using the multi or upsert flag? If possible, can you supply the exact invocation for writes that resulted in the conflicting documents?
2. Is this server part of a replica set or a standalone node? If it is a replica set, could you find all oplog entries involving the conflicting documents in question and paste them into this ticket as JSON?
3. Are the conflicting documents both present in the unique index "uidx-something-user", or is one of them missing? To determine this, you can issue a query with a hint to use uidx-something-user and filter on the _id of each conflicting document.
4. Can you paste the output of db.getCollectionInfos() on the database that contains uidx-something-user? I want to see, in particular, if you are using document validation.
5. Are you using the WiredTiger storage engine or mmap?
6. How often do you suspect duplicate data occurs? Does your application often attempt to write duplicates which result in a duplicate-key error?

Comment by Phillip Wirth [ 02/Mar/17 ]

Could this be a duplicate of SERVER-25791?

Generated at Thu Feb 08 04:17:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.