[SERVER-71246] Excessive CPU consumption of newly 5.0 upgraded instance Created: 10/Nov/22 Updated: 11/Nov/22 Resolved: 10/Nov/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Vladimir Beliakov | Assignee: | Yuan Fang |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu 18.04.6 LTS |
||
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
We're in the process of upgrading our MongoDB cluster from 4.4.15 to 5.0.10. {"t": {"$date":"2022-11-07T09:08:24.320+00:00"},"s":"W", "c":"COMMAND", "id":23802, "ctx":"conn3798","msg":"Plan executor error during findAndModify","attr":{"error": {"code":112,"codeName":"WriteConflict","errmsg":"WriteConflict error: this operation conflicted with another operation. Please retry your operation or multi-document transaction."},"stats":{"stage":"UPDATE","nReturned":0,"executionTimeMillisEstimate":1,"works":14,"advanced":0,"needTime":12,"needYield":1,"saveState":1,"restoreState":1,"failed":true,"isEOF":1,"nMatched":0,"nWouldModify":0,"nWouldUpsert":0,"inputStage":{"stage":"LIMIT","nReturned":1,"executionTimeMillisEstimate":1,"works":13,"advanced":1,"needTime":12,"needYield":0,"saveState":2,"restoreState":1,"isEOF":1,"limitAmount":1,"inputStage":{"stage":"FETCH","nReturned":1,"executionTimeMillisEstimate":1,"works":13,"advanced":1,"needTime":12,"needYield":0,"saveState":2,"restoreState":1,"isEOF":0,"docsExamined":1,"alreadyHasObj":0,"inputStage":{"stage":"SORT_MERGE","nReturned":1,"executionTimeMillisEstimate":1,"works":13,"advanced":1,"needTime":12,"needYield":0,"saveState":2,"restoreState":1,"isEOF":0,"sortPattern":{" {REDACTED}":1,"{REDACTED}":1},"dupsTested":1,"dupsDropped":0,"inputStages":[{"stage":"IXSCAN","nReturned":1,"executionTimeMillisEstimate":0,"works":1,"advanced":1,"needTime":0,"needYield":0,"saveState":2,"restoreState":1,"isEOF":0,"keyPattern": {"REDACTED"} ,"indexName":" {REDACTED} ","isMultiKey":false,"multiKeyPaths":{" {REDACTED}":[]," {REDACTED} ":[],"{REDACTED} ":[]},"isUnique":false,"isSparse":false,"isPartial":false,"indexVersion":2,"direction":"forward","indexBounds":{" {REDACTED}":["[0, 0]"],"{REDACTED}":["[MinKey, MaxKey]"]," {REDACTED} ":["[new Date(-9223372036854775808), new Date(1667812104306))"]},"keysExamined":1,"seeks":1,"dupsTested":0,"dupsDropped":0},{"stage":"IXSCAN","nReturned":0,"executionTimeMillisEstimate":1,"works":11,"advanced":0,"needTime":10,"needYield":0,"saveState":2,"restoreState":1,"isEOF":1,"keyPattern":{" {REDACTED}":1," {REDACTED} ":1,"{REDACTED} ":1},"indexName":" {REDACTED}","isMultiKey":false,"multiKeyPaths":{"{REDACTED}":[]," {REDACTED}":[],"{REDACTED}":[]},"isUnique":false,"isSparse":false,"isPartial":false,"indexVersion":2,"direction":"forward","indexBounds":{" {REDACTED} ":["[1, 1]"],"{REDACTED} ":["[MinKey, MaxKey]"]," {REDACTED}":["[new Date(-9223372036854775808), new Date(1667812104306))"]},"keysExamined":11,"seeks":11,"dupsTested":0,"dupsDropped":0}]}}}},"cmd":{"findAndModify":"{REDACTED}","query":{" {REDACTED}":{"$in":[0,1]},"{REDACTED}":{"$lt": {"$date":"2022-11-07T09:08:24.306Z"} }},"sort":{" {REDACTED} ":1,"{REDACTED} ":1},"update":{"$set":{" {REDACTED}":1,"{REDACTED}": {"$date":"2022-11-07T09:19:54.305Z"} ," {REDACTED} ":{"$binary":{"base64":"hW7CcBc5Uki6JVK/hvU5ew==","subType":"3"}},"{REDACTED} ":{"$date":"2022-11-07T09:08:24.305Z"}},"$unset":{" {REDACTED}":1},"$inc":{"{REDACTED}":1}},"runtimeConstants":{"localNow": {"$date":"2022-11-07T09:08:24.305Z"},"clusterTime":{"$timestamp": {"t":1667812104,"i":2118}}},"writeConcern":{"w":1,"wtimeout":0,"provenance":"implicitDefault"}}}}
We're urgently need help, because this is a show stopper for our MongoDB cluster optimization. Please let me know if any additional information is required. |
| Comments |
| Comment by Vladimir Beliakov [ 11/Nov/22 ] |
|
Hi Yuan Fang, Thank you for looking into this! |
| Comment by Yuan Fang [ 10/Nov/22 ] |
|
Thank you for your report. With regards to your question:
By looking at the diagnostic data you provided. At point A (the point upgrade to 5.0) through point B, I can see spikes of update, which seem correlated to write conflicts spikes. The write conflicts and update went down to 0 after point B, meanwhile, the CPU utilization also went down. Therefore, the high CPU utilization after the upgrade is likely caused by an increase in update operations. As for what causes the occurrence of write conflicts, I don't have a clear answer, usually, it is due to multiple clients trying to update one document at the same time. I would check the oplog to find more information if write conflicts occur again. For this issue, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project. |
| Comment by Vladimir Beliakov [ 10/Nov/22 ] |
|
Sorry, I forgot to mention that the incident (when we switched the primary to the 5.0 upgraded mongod instance) was happening from 2022-11-10 09:03:00 to 2022-11-10 09:07:00. |