[SERVER-5928] Non-atomic upsert Created: 25/May/12  Updated: 06/Mar/14  Resolved: 30/May/12

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: 2.0.5
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Aristarkh Zagorodnikov Assignee: Tyler Brock
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

I was under impression, that update w/upsert is atomic (since it operates on a single, though maybe not-yet-existing document). Unfortunately, when under moderate load we see duplicate key errors on id for upsert operations. I've read the docs and got the impression that $atomic modifier is not a solution (docs describe it as a way to do _multiple document updates concurrently). Is there something broken in the server, documentation or my own head? =) If it's a documentation issue (I suspect that, but I feel sorry for myself updating all our code that uses upserts), please update the docs with a clear statement that "upsert still can fail on non-unique IDs, use XXX instead".



 Comments   
Comment by auto [ 07/Jun/12 ]

Author:

{u'login': u'tychoish', u'name': u'Sam Kleinman', u'email': u'samk@10gen.com'}

Message: SERVER-5928 clarifying upsert documentation
Branch: master
https://github.com/mongodb/docs/commit/28d3c27efff41ef443ff380d6e1f9eede374ea8c

Comment by Aristarkh Zagorodnikov [ 30/May/12 ]

Good to hear, thanks for the update.

Comment by Tyler Brock [ 30/May/12 ]

Ok, I totally agree the documentation should be as good as possible. Thank you for bringing it up.

I've improved some of the update() docs to make it a little more clear that the determining factor for whether or not to do an update or insert comes from the criteria document passed as the first argument to update.

Comment by Aristarkh Zagorodnikov [ 30/May/12 ]

Yes, your explanation is perfect, having something like that in docs near "upsert" surely would prevent surprising problems. And I understand that MongoDB doesn't do anyting "smart", I was just pointing out that it doesn't and in my opinion shouldn't =)

Comment by Tyler Brock [ 30/May/12 ]

Hey Aristarkh,

Thank you for posting the code. Let's go through what is happening first so we are on the same page before we address the question.

The problem here is that the doc.version is still 2 on the second update.

The variable doc still contains the document retrieved by the initial findOne query. When you go to do a subsequent upsert the expression actually looks like this after being evaluated by javascript but before the query is run:

db.test.update({_id:1, version:2},{$set:{data:3}, $inc:{version:1}}, true);

This is an upsert, and so, because that document doesn't exist, MongoDB tries to insert it and fails.

It fails because document with the key _id equal to 1 is already in the collection and there is a unique index on that field. There can't be any other documents in the collection with that same key value.

Now, to answer you question, MongoDB is not doing anything smart here. I believe you are interpreting this behavior as Mongo looking to see if _id, the non-unique key, is non unique before inserting. That is not the case.

What happens is Mongo queries for a document that matches the query document portion of the update:

{_id: 1, version: 2}

It does not exist. This normally means that no update should occur, but because you are performing an upsert, Mongo attempts to insert the document and fails, because the _id is not unique.

Let me know if that makes sense. If this is at all unclear please let me know so we can update our documentation appropriately.

Comment by Aristarkh Zagorodnikov [ 27/May/12 ]

The following JS code should better illustrate what am I doing:

// Insert test document
> db.test.insert({_id:1, version:2, data:3})
// Get current document
> var doc = db.test.findOne({_id:1});
> printjson(doc);
{
        "_id" : 1,
        "version" : 2,
        "data" : 3
}
// Perform some updates on doc.data
...
// Update
> db.test.update({_id:1, version:doc.version},{$set:{data:doc.data}, $inc:{version:1}}, true);db.getLastError();
// Update succeeded
null
// Try once more
> db.test.update({_id:1, version:doc.version},{$set:{data:doc.data}, $inc:{version:1}}, true);
// Failed with "non-unique key" since "version" field changed
E11000 duplicate key error index: d2_feed0.feed.$_id_  dup key: { : 1 }

I think that MongoDB should not try be "smarter" and filter out non-unique index keys before querying to prevent this (when I first encountered this, I expected having "affected documents count" being zero instead of "non-unique key" error), but the current mechanism hurts learnability (you do query on _id only and everything works as expected, you add another fields and you get unexpected error that is not easy to grasp at first). I propose only documentation improvements, not the core server changes (unless that's the plan).

Comment by Eliot Horowitz (Inactive) [ 27/May/12 ]

Can you try this with a 2.1 nightly?

Comment by Aristarkh Zagorodnikov [ 27/May/12 ]

I never asked about multiple updates (the documentation is pretty clear about this), please pay attention when reading ticket description.
I was talking about upsert functionality and how it operates unexpectedly, returning "duplicate key error", when one uses upsert with (for example) two fields in "query" part of the update operation.
In our code this pattern is very common, since we use it to perform version-based optimistic concurrency updates: update the document only if it's "version" (actually the "v" field) didn't change between read and write, incrementing it, elseway repeat the read-modify-write cycle. This leads us to requirement that query should be compound (

{ _id: <some_id>, v: <current_version> }

). If the document is updated in some other thread, update can't match the "v" part of the query and tries to insert document.

Comment by Tyler Brock [ 25/May/12 ]

Atomicity within MongoDB is at the document level.

There is currently no way to do multiple updates concurrently unless those updates use the atomic modifiers ($inc, $set, etc...) to modify a single document.

To update multiple documents atomically you would need transaction support which does not currently exist in mongodb.

Comment by Tyler Brock [ 25/May/12 ]

Aristark could you post a the query you are having trouble with?

Comment by Aristarkh Zagorodnikov [ 25/May/12 ]

Yes, this is definitely the case. Sorry for bothering you with this. The "Upsert with modifiers" section of http://www.mongodb.org/display/DOCS/Updating (or the new manual) should have something on this topic.

Comment by Aristarkh Zagorodnikov [ 25/May/12 ]

It appears that the root cause in my case is that query is compound one.

Generated at Thu Feb 08 03:10:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.