[SERVER-16295] Do not count duplicate index exception (E11000) as user-assertion Created: 24/Nov/14 Updated: 19/Dec/18 Resolved: 22/May/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 2.6.0, 2.6.5 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Erez Lirov | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
We're trying to use MongoDB as a global lock across multiple VMs. We've created a lock collection with a unique index on the type of lock, and we lock by trying to insert a record with a certain lock type value. If there's a duplicate exception, we know it's already locked. Otherwise, we know we're locked. The problem now that we upgraded to 2.6.5 from 2.4 is that mongo sees the duplicate key exception as a user error and is logging it into our log files and counting it internally as a user error. So, our monitoring software keeps flagging this as an exception, whereas it's really the way the system was designed to run. |
| Comments |
| Comment by Erez Lirov [ 19/Dec/18 ] | |||||||||||||||||||||||||||||||||||||||
|
I guess this was closed a while ago, but it's still very much an issue. The question is whether Duplicate Index Exceptions are actually an ERROR condition. In many cases, they are NOT errors at all, but the way the application is expected to implement global locks or many other concurrency checks. Under the current behavior of mongo, there's no way to differentiate between true errors and "false positives". So, we're getting paged by our monitoring systems and when there's nothing to fix. If we could configure our server to NOT increment user asserts whenever duplicate key exceptions are encountered, that would fix this issue (probably for lots of users.) | |||||||||||||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 12/Jan/15 ] | |||||||||||||||||||||||||||||||||||||||
|
Thanks for the additional information elirov. I'm re-purposing this ticket as an improvement suggestion. Regards, | |||||||||||||||||||||||||||||||||||||||
| Comment by Erez Lirov [ 10/Jan/15 ] | |||||||||||||||||||||||||||||||||||||||
|
There's really no "operational" difference that we're seeing between 2.4 and 2.6. It's all about the monitoring (for example, in LogicMonitor) that's complaining about too many user asserts against our production mongo database. This is an important monitor because it lets us know if we have bad code in our application that is causing mongo exceptions. The problem is that if E11000 creates false positives (i.e., complaining about a normal operation like trying to insert duplicate rows in a unique index) then that creates noise that clutters our monitor and prevents us from finding application errors as efficiently. Since inserting against a unique index and expecting a failure is really the only way to implement things like distributed locks, this should be considered a normal operation and mongo should not raise any exceptions or add any asserts to its internal counters. At the very least, if we had a way to configure which exceptions would be ignored by the assert counter and logger, that would be great. Thanks! | |||||||||||||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 08/Jan/15 ] | |||||||||||||||||||||||||||||||||||||||
|
elirov, we haven't heard back from your for a while. If this is still an issue for you, can you please elaborate on the behavior you were seeing with 2.4.x and what you're seeing with 2.6.5 that's different? Also, could it be that the additional error information returned by 2.6 is what's causing the issue to begin with? Thanks, | |||||||||||||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 01/Dec/14 ] | |||||||||||||||||||||||||||||||||||||||
|
If I understand correctly, your application was working with 2.4 despite these user assertions, so I wonder if there's some other behavior change between 2.4 and 2.6 at play here, and whether it's a regression or was an intended change. Can you please elaborate on the behavior you were seeing with 2.4.x and what you're seeing with 2.6.5 that's different? Unless we find some behavior change I think this ticket may get resolved as "Works as designed", since the user asserts haven't changed between 2.4 and 2.6 as far as I can tell. If we can't find a bug here I'd suggest you post on the mongodb-user group or Stack Overflow with the mongodb tag, as the SERVER project is for reporting bugs and feature suggestions for the MongoDB server. A question in the mongodb-user group will reach a larger audience. | |||||||||||||||||||||||||||||||||||||||
| Comment by Erez Lirov [ 27/Nov/14 ] | |||||||||||||||||||||||||||||||||||||||
|
Does it make sense to NOT increment asserts.user at all when hitting the duplicate index exception? It doesn't seem like it should be treated as an exception condition, since it's the only way to implement distributed locks reliably using Mongo. | |||||||||||||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 25/Nov/14 ] | |||||||||||||||||||||||||||||||||||||||
|
I did find a slight difference in behavior between 2.4.5 and 2.6.5:
As you can see, in 2.6.5 the user assertions are incremented by 2 and in 2.4.5 they're incremented only by one. So while you'll see this number climb up twice as fast in 2.6.5, it is also incremented in 2.4.5 when attempting to insert a duplicate key. You'll also find that 2.4.5 and 2.6.5 exhibit the same logging behavior at loglevel 0 (neither logs the error) and loglevel 1 (both log the error), so I'm still not sure what's the exact behavior change that's impacting your application. If the loglevel was changed from 0 to 1 that could be the source of the additional logging. About the loglevel, in the example below I check it and the server tells me it's 1, then I set it to 0 and check that the change was successful:
| |||||||||||||||||||||||||||||||||||||||
| Comment by Erez Lirov [ 25/Nov/14 ] | |||||||||||||||||||||||||||||||||||||||
|
We were on 2.4.5, I believe. I'm less concerned about the log file, and more concerned about the asserts.user count inside Mongo that we get from requesting a status: "asserts": { "regular": 0, "warning": 0, "msg": 0, "user": 3500586, "rollovers": 0 }, It seems that E11000 increments the asserts.user count. | |||||||||||||||||||||||||||||||||||||||
| Comment by Ramon Fernandez Marina [ 25/Nov/14 ] | |||||||||||||||||||||||||||||||||||||||
|
elirov, what's the 2.4 version you were running before? Are you running with logLevel set to 1 or higher? I tried a few 2.4 versions as well as 2.6.5, and at logLevel 0 a duplicate key error never shows up in the logs. Can you please be more specific about the behavior you were seeing with 2.4.x and how is it different than 2.6.5? |