[SERVER-14322] Retry on predicate unique index violations of update + upsert -> insert when possible | Created: 20/Jun/14 | Updated: 04/Dec/23 | Resolved: 29/Jan/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Write Ops |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Scott Hernandez (Inactive) | Assignee: | Backlog - Query Team (Inactive) |
| Resolution: | Duplicate | Votes: | 117 |
| Labels: | storch |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Query |
| Participants: | Amit Gupta, Andrew Doumaux, Andy Schwerin, Arthur Darcet, Asya Kamsky, Backlog - Query Team, bh, bob whitehurst, Brad Vogel, Daniel Barreto, Dnv.IytH, Gaël Jourdan-Weil, Guillaume Rousseau, James Perryman [X], James Wahlin, Jan Kondratowicz, Jon Hyman, Jordi Salvat i Alabart, Kuan-Ru Fu, Michael Kania, nagisatosanae, Ramon Fernandez Marina, Ryan, Scott Hernandez, Sergio Rykov, Sherry Ummen, Tim Widmann, Tudor Aursulesei, Vishnukanth Reddy Boreddy, 아나 하리 |
| Case: | (copied to CRM) |
| Description |
|
Issue Status as of Jan 29, 2019

ISSUE DESCRIPTION AND IMPACT
During an update with the upsert:true option, two (or more) threads may attempt an upsert operation using the same query predicate and, upon not finding a match, each will attempt to insert a new document. Both inserts will (and should) succeed, unless the second causes a unique constraint violation. This MongoDB behavior is documented here. It is expected that the client will take appropriate action upon detecting such a constraint violation; the appropriate action may vary depending on individual application requirements.

IMPROVEMENT DESCRIPTION
As part of
The following table contains examples of upsert operations that the server can automatically retry on DuplicateKey error, and examples where the server cannot automatically retry:

This ticket will now be resolved as a duplicate of

VERSIONS

Original description
It is possible that two updates come in with upsert:true, resulting in neither finding a document and both inserting new documents that conflict on unique index violations of the query predicate. In this case it is possible for the server to retry any failed updates. Currently clients can retry the update to get the same behavior that is expected from the server; a minimal sketch of such a client-side retry follows this description. This affects both updates and findAndModify. See |
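For reference, a minimal client-side retry of the kind described above might look like the following mongo shell sketch. The collection name, field names, and retry limit are illustrative assumptions, not part of the original ticket; it also assumes the duplicate key error comes from a unique index on the same fields as the query predicate.

```js
// Sketch only: retry an upsert that loses a race with a concurrent upsert
// on the same predicate and fails with a DuplicateKey (E11000) error.
function upsertWithRetry(coll, query, update, maxRetries) {
    for (var attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            return coll.updateOne(query, update, { upsert: true });
        } catch (e) {
            // On retry, the document inserted by the concurrent writer now
            // matches the predicate, so the operation becomes a plain update.
            if (e.code !== 11000 || attempt === maxRetries) {
                throw e;
            }
        }
    }
}

// Hypothetical usage:
// upsertWithRetry(db.counters, { customerId: 1, event: "click" },
//                 { $inc: { count: 1 } }, 3);
```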
| Comments |
| Comment by James Wahlin [ 21/Oct/22 ] | |||||||||||||||||||||||||
|
vickyrddy97@gmail.com this fix is not tied to FCV version and will be available under FCV 4.4 with the 5.0 binary. | |||||||||||||||||||||||||
| Comment by Vishnukanth Reddy Boreddy [ 21/Oct/22 ] | |||||||||||||||||||||||||
|
@James Wahlin Do we need FCV also updated to 5.0 for the fix to work, or would just a DB upgrade to 5.0 with FCV at 4.4 also work? | |||||||||||||||||||||||||
| Comment by Gaël Jourdan-Weil [ 28/Sep/21 ] | |||||||||||||||||||||||||
|
Thanks @James Wahlin for the notice | |||||||||||||||||||||||||
| Comment by James Wahlin [ 27/Sep/21 ] | |||||||||||||||||||||||||
|
| |||||||||||||||||||||||||
| Comment by Jordi Salvat i Alabart [ 27/Sep/21 ] | |||||||||||||||||||||||||
|
Same here. I'll open a separate ticket. | |||||||||||||||||||||||||
| Comment by Kuan-Ru Fu [ 10/Sep/21 ] | |||||||||||||||||||||||||
|
I also get the same problem with `findAndModify` on MongoDB 4.2. Is this improvement already in MongoDB 4.2? | |||||||||||||||||||||||||
| Comment by Gaël Jourdan-Weil [ 02/Sep/21 ] | |||||||||||||||||||||||||
|
Is the auto-retry mechanism applied in the case of `findAndModify`? I'm using Mongo 4.2 and experiencing the error with `findAndModify` operations even though my query matches all the conditions for auto retry. | |||||||||||||||||||||||||
| Comment by Arthur Darcet [ 14/Jun/18 ] | |||||||||||||||||||||||||
|
@Andy Schwerin Just to be sure I understand this correctly: why isn't it enough to check that the operation tried an insert and that the duplicate document that triggered the exception matches the predicate that was sent?
| |||||||||||||||||||||||||
| Comment by Andy Schwerin [ 13/Nov/17 ] | |||||||||||||||||||||||||
|
This behavior is distinct from the one that retryable writes addresses. Retryable write support requires being certain about whether a previously attempted write happened or not, in the face of communication errors. This behavior occurs even in the presence of certain knowledge about whether a write succeeded or failed. Indeed, the DuplicateKey error tells the client in no uncertain terms that the write failed. The challenge for automatic retry on duplicate key error is that we cannot know that the violated constraint is the unique index constraint from the match criteria, rather than some other unique index that might exist on the document. Handling a situation in which there may be multiple unique indexes correctly is what makes this harder to do in MongoDB than in the application, which knows what's what. Put another way, automatically retrying on a DuplicateKey error might fail indefinitely if the unique index constraint violated was not the one from the match expression. | |||||||||||||||||||||||||
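To make that failure mode concrete, here is a hypothetical shell sketch (collection, index, and field names are illustrative, not from this ticket) in which a blind automatic retry would never succeed:

```js
// Hypothetical collection with two unique indexes.
db.users.createIndex({ username: 1 }, { unique: true });
db.users.createIndex({ email: 1 }, { unique: true });

// Upsert keyed on username. If the insert fails because a *different*
// document already owns email "taken@example.com", the predicate still
// matches nothing on the next attempt, so retrying hits the same
// DuplicateKey (E11000) error every time.
db.users.updateOne(
    { username: "alice" },
    { $set: { email: "taken@example.com" } },
    { upsert: true }
);
```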
| Comment by Sherry Ummen [ 13/Nov/17 ] | |||||||||||||||||||||||||
|
So, now that 3.6 has retryable writes, does that mean that using 3.6 this will no longer be a problem? | |||||||||||||||||||||||||
| Comment by Amit Gupta [ 13/Nov/17 ] | |||||||||||||||||||||||||
|
I believe I'm experiencing this issue as well. The comments above seem to hint that creating a unique index on the query predicate will either reduce or eliminate this issue; is either the case? Any further explanation of how/why a unique index affects the occurrence of the issue? | |||||||||||||||||||||||||
| Comment by Brad Vogel [ 13/Oct/17 ] | |||||||||||||||||||||||||
|
I saw that Mongo 3.6 has retryable writes (https://groups.google.com/forum/#!msg/mongodb-announce/BwDlVtKw74I/HH86-nnhAgAJ). Does that mean this is fixed? | |||||||||||||||||||||||||
| Comment by nagisatosanae [ 01/Sep/17 ] | |||||||||||||||||||||||||
|
Extremely critical!! | |||||||||||||||||||||||||
| Comment by bh [ 27/Jun/17 ] | |||||||||||||||||||||||||
|
+1 This is annoying and wastes a bit of time for `updateOne`, but it's a breaking issue for bulk writes, where a single failure affects the entire group of updates. | |||||||||||||||||||||||||
| Comment by Tudor Aursulesei [ 17/Apr/17 ] | |||||||||||||||||||||||||
|
+1 | |||||||||||||||||||||||||
| Comment by James Perryman [X] [ 12/Jan/17 ] | |||||||||||||||||||||||||
|
This is a pretty big issue and remains in the current version, 3.4. It means MongoDB cannot be used properly in a two-phase commit scenario. Please provide an update. | |||||||||||||||||||||||||
| Comment by Sergio Rykov [ 09/Jan/17 ] | |||||||||||||||||||||||||
|
MongoDB 3.4 is out and we are now in 2017 with this issue. What's the state of this issue? | |||||||||||||||||||||||||
| Comment by Guillaume Rousseau [ 16/Dec/16 ] | |||||||||||||||||||||||||
|
Got an index creation failure because of this bug: pymongo.errors.DuplicateKeyError: E11000 duplicate key error. It kills the running process since the index is building in the background. | |||||||||||||||||||||||||
| Comment by Jan Kondratowicz [ 14/Dec/16 ] | |||||||||||||||||||||||||
|
Since we updated to WiredTiger, we're having hundreds of those errors daily. I believe it's a major issue, making write operations randomly unreliable. I've been encountering this issue on a non-sharded collection, running a single NodeJS process, which is hardly a parallel environment. | |||||||||||||||||||||||||
| Comment by Sherry Ummen [ 17/Oct/16 ] | |||||||||||||||||||||||||
|
We are losing data quite often now because of this; we have not added retry logic yet. Wondering whether this got any updates, or whether any discussion has happened on this issue? | |||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 01/Sep/16 ] | |||||||||||||||||||||||||
|
Hi sherry.ummen@napa.fi, please see Andy's comment above. A solution to this ticket requires quite a bit of discussion that, unfortunately, hasn't happened yet to the best of my knowledge. So while we'd like to address it in the upcoming MongoDB 3.4 (as indicated by the "3.3 Desired" fixVersion), it is likely that a fix for this ticket will slip, and users will have to continue to use application code to work around this behavior. We'll post an update on this ticket if the current plan and schedule change. Regards, | |||||||||||||||||||||||||
| Comment by Sherry Ummen [ 31/Aug/16 ] | |||||||||||||||||||||||||
|
Hello, could you please tell us the status of this issue? When will it be ready, and in which release? Thanks | |||||||||||||||||||||||||
| Comment by bob whitehurst [ 07/Jun/16 ] | |||||||||||||||||||||||||
|
I am not sure if this may be helpful. I was having this problem and changed the shard key to be unique, which resolved the problem. That may not work for everyone.

While diagnosing the problem I turned the verbosity level to 5 and was able to capture the error condition in the log. There were two threads doing an upsert on the same entry that did not exist. The first thread (conn20) determines that the entry does not exist, but instead of committing the transaction it does a rollback. A second thread (conn17) does the same action and does a commit. After a slight delay, the first thread then creates a new transaction but does not check whether an entry already exists, and thus it results in a duplicate entry. I can't copy and paste from my log as it is on an isolated network, but this is an abbreviated version: [conn20] Winning plan: ... | |||||||||||||||||||||||||
| Comment by Jon Hyman [ 07/Apr/16 ] | |||||||||||||||||||||||||
|
We rewrote every single upsert in our application to use a global Redis lock based on the shard key, which is clearly undesirable, but I think it's a more complete solution than the retry. The retry approach works only if you have a unique index on your insertion/query predicate; if you don't have a unique index, you can still insert duplicate data when this scenario happens. Chances are if you're using an upsert you can probably solve it via a unique index, but not always. | |||||||||||||||||||||||||
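For illustration only (collection and field names are hypothetical), the no-unique-index case mentioned above can be seen by running the same upsert from two shells at roughly the same time: both can miss the match phase, both insert, and with no unique index there is no E11000 error for a retry to catch.

```js
// Run concurrently from two shells against a collection with NO unique
// index on { customerId, event }:
db.events.updateOne(
    { customerId: 1, event: "signup" },
    { $inc: { count: 1 } },
    { upsert: true }
);

// Possible outcome after the race (two documents for the same predicate):
// { customerId: 1, event: "signup", count: 1 }
// { customerId: 1, event: "signup", count: 1 }
```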
| Comment by Daniel Barreto [ 07/Apr/16 ] | |||||||||||||||||||||||||
|
We just upgraded to WiredTiger (3.2.3) and this issue is happening a lot. We are loosing data because of this and will have to change our application logic to add retries there (really annoying since "upsert" should not be complaining about duplicate entries (E11000). Please let us know if there is any other workaround. | |||||||||||||||||||||||||
| Comment by Jon Hyman [ 11/Jan/16 ] | |||||||||||||||||||||||||
|
Fantastic, thanks. | |||||||||||||||||||||||||
| Comment by Asya Kamsky [ 11/Jan/16 ] | |||||||||||||||||||||||||
|
Confirmed - as of 2.2 you can use a prefix of an index as your shard key, so having an index on customer_id, event, date (whether unique or non-unique) means you are not required to have one on just customer_id, event (your shard key). | |||||||||||||||||||||||||
| Comment by Jon Hyman [ 10/Jan/16 ] | |||||||||||||||||||||||||
|
If our shard key is { customer_id, event } (non-unique), I didn't realize we could add a unique index on { customer_id, event, date }. We definitely have that index, it's just not unique, so we could change it to that if it's possible. | |||||||||||||||||||||||||
| Comment by Asya Kamsky [ 10/Jan/16 ] | |||||||||||||||||||||||||
|
Do you not have a unique index on customer_id, event, date? It seems like you should want to have such an index for performance (and you don't need to change the shard key here - in fact after creating the new index you can drop the index on customer_id,event) | |||||||||||||||||||||||||
| Comment by Jon Hyman [ 09/Jan/16 ] | |||||||||||||||||||||||||
|
Thanks, Andy, it would be good to know for clarification. Here's one example. We have a collection that tracks occurrences of an event over time and looks like this: { customer_id: X, event: Y, date: Z, num_occurrences: W }. It is sharded on { customer_id: 1, event: 1 }, and we update it with an upsert that $inc's num_occurrences on a given date. We did this so queries for a given { customer_id, event } pair over a date range would be on the same shard for query isolation. We had figured that Mongo's upsert was atomic when we made the design decision years ago. Now I'm concerned that this will cause two documents for a given { customer_id, event, date } pair. While we could go back and update the shard key, the collection has hundreds of millions of documents in it now and is hundreds of gigabytes, so that would be unrealistic without a significant amount of downtime. It would also alter query isolation. As such, I'll be updating our code to manually handle the synchronization of the insert for this collection. | |||||||||||||||||||||||||
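For context, the update described above would look roughly like this sketch (a guess at the shape of the operation, not the actual application code; the literal values are placeholders):

```js
// Sketch of the kind of upsert described above (values are placeholders).
db.event_counts.updateOne(
    { customer_id: 42, event: "page_view", date: ISODate("2016-01-09") },
    { $inc: { num_occurrences: 1 } },
    { upsert: true }
);
```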
| Comment by Andy Schwerin [ 09/Jan/16 ] | |||||||||||||||||||||||||
|
jonhyman, I have a minor clarification here. In a sharded collection, an update with upsert must always contain the document's shard key, since in the event that the upsert becomes an insert the shard key must be known. Any index that has the shard key as its prefix can also enforce a unique index constraint in mongodb, IIRC. I just looked, and think the documentation may be incorrect on this point. If you're interested, I can do a little digging. Anyhow, if you combine those two facts, you have a situation where you can always have a unique index when using upsert, unless you really cannot have a unique index on the search criteria due to the application design. However, if your application cannot accept a uniqueness constraint on your upsert criteria, in what sense is it incorrect to insert a second document? That is, if it is legal for two documents to match the criteria, how should mongodb decide when to insert a new document or not? This ticket is really more about automatically performing the upsert as an update on unique key violation, rather than requiring the client to do so. Prior to MongoDB 3.0, this situation was somewhat rare, but starting in 3.0, when using the WiredTiger storage engine, it has become more frequent, making it more of a hassle for app developers than it used to be. Even the situation in this ticket is a little tricky, because the automatic retry on duplicate key is based on the assumption that the violated constraint is the unique index constraint from the match criteria, and not some other unique index that might exist on the document. Handling a situation in which there may be multiple unique indexes correctly is what makes this harder to do in MongoDB than in the application, which knows what's what. | |||||||||||||||||||||||||
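As a concrete illustration of the shard-key-prefix point (namespace and field names are hypothetical), assuming the collection is sharded on { customer_id: 1, event: 1 }:

```js
// Hypothetical sharded collection; the unique index is allowed because
// its prefix is the shard key, and an upsert on
// { customer_id, event, date } always carries the shard key.
sh.shardCollection("app.event_counts", { customer_id: 1, event: 1 });
db.event_counts.createIndex(
    { customer_id: 1, event: 1, date: 1 },
    { unique: true }
);
```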
| Comment by Jon Hyman [ 08/Jan/16 ] | |||||||||||||||||||||||||
|
As a workaround, we're going to use Redis for locking around all Mongo upserts to add the missing atomicity. While it's easy to code a retry when you get a primary key error, the fact that upserts can cause multiple document insertions where there are no unique indexes is pretty awful. Sharded collections can only have one unique index, so you're pretty limited in your options here once you shard. | |||||||||||||||||||||||||
| Comment by Tim Widmann [ 08/Jan/16 ] | |||||||||||||||||||||||||
|
I also got this with MongoDB 3.0.6 and WiredTiger | |||||||||||||||||||||||||
| Comment by Ryan [ 08/Jan/16 ] | |||||||||||||||||||||||||
|
I just hit this on mongodb-linux-x86_64-rhel70-3.2.0 with wiredTiger. | |||||||||||||||||||||||||
| Comment by Jon Hyman [ 08/Jan/16 ] | |||||||||||||||||||||||||
|
We just got this on 3.0.8. Had never seen it before (though I was on 2.6.11 previously). | |||||||||||||||||||||||||
| Comment by 아나 하리 [ 04/Jan/16 ] | |||||||||||||||||||||||||
|
It's okay up to MongoDB 3.0.3, but from 3.0.4 to 3.0.6 (the last version I have tested) the duplicate key error happens. | |||||||||||||||||||||||||
| Comment by Dnv.IytH [ 30/Dec/15 ] | |||||||||||||||||||||||||
|
In which version did this problem begin to appear? It works well with the WiredTiger engine in version 3.0.0. | |||||||||||||||||||||||||
| Comment by 아나 하리 [ 21/Nov/15 ] | |||||||||||||||||||||||||
|
This E11000 (duplicate key) exception during upsert operations happens starting from MongoDB 3.0.4: https://github.com/mongodb/mongo/commit/bfdf548bc4c882ab944e0cf2f1e6015752fc05c2 I think the MONGO_WRITE_CONFLICT_RETRY_LOOP_BEGIN/END macro for "UpdateStage::doInsert()" is useless during an "upsert" operation, and "UpdateStage::work(WorkingSetID* out)" already has retry code. | |||||||||||||||||||||||||
| Comment by Andrew Doumaux [ 09/Sep/15 ] | |||||||||||||||||||||||||
|
I'm receiving this same error in v3.0.6 with the WiredTiger engine when upserting into a collection that only has an index on _id and the upsert query is only on _id. This is using the MongoDB Java driver, version 2.13. I'm currently using Apache Camel Split to handle the multithreaded processing of records that are upserted into Mongo. | |||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 19/Aug/15 ] | |||||||||||||||||||||||||
|
Simple shell repro for this problem, courtesy of igor:
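A minimal shell sketch of that kind of repro (illustrative collection and field names, not the original snippet) is:

```js
// Sketch of a repro: many identical upserts racing from two shells
// against a unique index on the upsert predicate. Failed writes surface
// as DuplicateKey (E11000) write errors despite upsert:true.
db.race.drop();
db.race.createIndex({ x: 1 }, { unique: true });

var worker = startParallelShell(
    'for (var i = 0; i < 10000; i++) {' +
    '    db.race.update({ x: i % 100 }, { $inc: { n: 1 } }, { upsert: true });' +
    '}');

for (var i = 0; i < 10000; i++) {
    var res = db.race.update({ x: i % 100 }, { $inc: { n: 1 } }, { upsert: true });
    if (res.hasWriteError()) {
        print(res.getWriteError().errmsg);
    }
}
worker();  // wait for the parallel shell to finish
```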
| |||||||||||||||||||||||||
| Comment by Michael Kania [ 19/Aug/15 ] | |||||||||||||||||||||||||
|
Seems like lots of reports. We see this issue being pretty common on 3.0 with WiredTiger and RocksDB. What's the plan with this ticket? |