[SERVER-1094] mongostat completely locks under high load Created: 06/May/10 Updated: 24/May/10 Resolved: 24/May/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Admin |
| Affects Version/s: | 1.4.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Ryan Nitz | Assignee: | Eliot Horowitz (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Linux FC8 |
||
| Participants: |
| Description |
|
When MongoDB is under high load with many connections, mongostat locks up after connecting. Accessing the shell still works. This isn't the incremental lock I saw earlier: the command does not return any data (it just hangs). Estimated load: Connections: 2k |
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 24/May/10 ] |
|
If you still want to preheat, can you make a separate case for that?
| Comment by Ryan Nitz [ 12/May/10 ] |
|
Ignore my comment about the mongo CPU spiking when adding connections. There was a bug in our app and we were loading a lot of data on app startup. |
| Comment by Ryan Nitz [ 07/May/10 ] |
|
OK, after more investigation, I know what happened:
1. I did a restore on a database (moved to another machine).
2. I ran a db.repairDatabase().
3. I restarted Mongo.
4. At this point the server load average was ~7.
5. I added the load above. When initializing a lot of rapid new connections, the Mongo CPU consumption spikes a bit.
6. The 60 CPUs started interacting with Mongo.
7. Mongo did not have most of its data in core, resulting in a lot of disk access.
8. The server load average jumped to ~12.
9. mongostat started hanging on connect (the strange part is that the shell was still working well).
The good news is that I ran the same test with 80 CPUs (the db was already all in core) and everything was perfect.
Feature request: the ability to preheat data on startup (e.g., load X bytes worth of data from collections x, y and z before accepting connections). |
| Comment by Ryan Nitz [ 06/May/10 ] |
|
Well... I am just telling you what I saw on 1.4.2 on an xl EC2 instance (FC8). FYI - After I reduced the load, the command was working again. I am going to run some tests again later in the day. I'll attach the info after the tests. |
| Comment by Eliot Horowitz (Inactive) [ 06/May/10 ] |
|
We've seen it used under much higher load without a problem. |