[SERVER-20927] 100% CPU on mongo 3.0.4 Created: 14/Oct/15 Updated: 09/Jan/16 Resolved: 09/Jan/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Admin |
| Affects Version/s: | 3.0.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Mike Bartlett | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Steps To Reproduce: | Run mongo for a while, then it breaks, lose confidence. |
| Participants: |
| Description |
|
Hi folks, We previously commented on https://jira.mongodb.org/browse/SERVER-19485 and realise this may be fixed in subsequent versions, but thought it wise to report it nonetheless and attach ss.log. I didn't run it very long, as I had to stepDown the server: this is an operating production environment and things were pretty borked during the 100% CPU spike. As soon as the stepDown occurred, the CPU dropped back down to normal operating parameters. |
| Comments |
| Comment by Ramon Fernandez Marina [ 23/Nov/15 ] |
|
mydigitalself, in the latest data you uploaded I see peaks of over 400K operations per second, and up to 11K connections created per second, so it could be that this server is not powerful enough to handle the load it's being subjected to. In addition, there have been numerous fixes since version 3.0.4, so if this is still an issue for you, can you please try MongoDB 3.0.7 and let us know if the issue persists? Thanks, |
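For readers wondering how per-second figures like these can be derived from periodic serverStatus captures, here is a minimal sketch. It assumes each sample is a dict shaped like serverStatus output, using the cumulative `opcounters` and `connections.totalCreated` fields, and that samples were taken at a fixed interval. The function name `per_second_rates` and the sample values in the usage note are hypothetical, and this is not necessarily how the figures above were produced.

```python
from typing import Dict, List


def per_second_rates(samples: List[dict], interval_s: float) -> List[Dict[str, float]]:
    """Turn cumulative serverStatus counters into per-second rates.

    Each sample is assumed to contain the cumulative counters
    sample["opcounters"] (a dict of operation type -> count) and
    sample["connections"]["totalCreated"]. Samples are assumed to be
    evenly spaced, interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")

    rates = []
    # Compare each sample with the one that follows it.
    for prev, cur in zip(samples, samples[1:]):
        ops_delta = sum(cur["opcounters"].values()) - sum(prev["opcounters"].values())
        conns_delta = (
            cur["connections"]["totalCreated"] - prev["connections"]["totalCreated"]
        )
        rates.append(
            {
                "ops_per_s": ops_delta / interval_s,
                "conns_per_s": conns_delta / interval_s,
            }
        )
    return rates
```

Usage, with made-up numbers: given two samples taken 10 seconds apart whose summed opcounters grow from 1,500 to 6,500 and whose `totalCreated` grows from 100 to 300, the function reports 500 operations per second and 20 new connections per second for that interval.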
| Comment by Mike Bartlett [ 14/Oct/15 ] |
|
So it started happening on the new primary a few mins after election, but only very briefly. ss-10s.log should capture the 100% CPU. I then noticed your instructions in the previous bug report for the iostat log, so I dropped the serverStatus interval down to 1s and included iostat.log, but the CPU appears to have dropped back down to regular parameters either just before or perhaps during this period. |
| Comment by Mike Bartlett [ 14/Oct/15 ] |
|
What I'd really love to know is why the replicaset doesn't pick this up and re-elect, because clearly the primary is unhealthy in this state. |