[SERVER-3361] mongod crashes after starting with --master Created: 02/Jul/11 Updated: 29/Feb/12 Resolved: 31/Oct/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 1.8.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | James Simpson | Assignee: | Mathias Stearn |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | crash, master, replication | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
CentOS 5.5, 64-bit |
||
| Operating System: | Linux |
| Participants: |
| Description |
|
I've been running MongoDB fine for the past week or so. I shut it down and tried to restart it with the following:
./mongod --dbpath /etc/mongodb/data --fork --logpath /var/log/mongodb.log --logappend --quiet --journal --bind_ip 127.0.0.1 --nohttpinterface
However, the mongod process quickly disappeared and I had to restart without --master to get it running again. This is what I got in my log (my database has less than 500MB of data since I've just been doing testing with it, and the box it is on has 8GB of RAM):
Fri Jul 1 21:15:09 [initandlisten] waiting for connections on port 27017
Fri Jul 1 21:15:09 [FileAllocator] allocating new datafile /etc/mongodb/data/local.ns, filling with zeroes... |
| Comments |
| Comment by Ian Whalen (Inactive) [ 31/Oct/11 ] |
|
@james, I'm closing this as Cannot Reproduce. Please do reopen if you're continuing to experience these mongod crashes. |
| Comment by Mathias Stearn [ 19/Aug/11 ] |
|
Are you still having trouble? |
| Comment by Mathias Stearn [ 10/Aug/11 ] |
|
Do you know how many connections there were to that server? We've seen issues where Linux allocates a full 8MB per connection even if we only use a few KB. If there are a lot of connections, I'd suggest running "ulimit -s 1024" to reduce the stack size to 1MB. We do this automatically in 1.9/2.0, but it must be done manually in 1.8.2. |
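As a rough illustration of the stack-size point, the sketch below estimates how much virtual memory the connection stacks would reserve at 8MB versus 1MB each. The helper name stack_reservation_mb and the figure of 1000 connections are hypothetical, chosen only for the example; the actual connection count on this server is not known from the ticket.

```shell
#!/bin/sh
# Sketch only: estimate virtual memory reserved for per-connection stacks.
# stack_reservation_mb is a hypothetical helper, not part of MongoDB.
# Usage: stack_reservation_mb <connections> <stack_size_kb>
stack_reservation_mb() {
    connections=$1
    stack_kb=$2
    # connections * KB per stack, converted to MB
    echo $(( connections * stack_kb / 1024 ))
}

# Show the current soft stack limit for this shell (in KB, or "unlimited").
echo "current stack limit: $(ulimit -s) KB"

# Assumed example load: 1000 connections.
echo "8MB stacks: $(stack_reservation_mb 1000 8192) MB reserved"
echo "1MB stacks: $(stack_reservation_mb 1000 1024) MB reserved"
```

The lower limit only applies to processes started from the shell where it was set, so "ulimit -s 1024" would need to be run in the same shell (or init script) immediately before launching mongod.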
| Comment by James Simpson [ 05/Jul/11 ] |
|
overcommit_memory is 0 and overcommit_ratio is 50. This is running on a VDS, so it is possible that all the space shows as virtual space. |
| Comment by Eliot Horowitz (Inactive) [ 05/Jul/11 ] |
|
It's 7GB of physical memory free, but your virtual space is highly used, and the OS is probably not willing to overcommit. It's a little hard to tell without seeing it as it's crashing. Can you add some swap space to test? Also, can you send the contents of: /proc/sys/vm/overcommit_memory |
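A minimal sketch of how the overcommit settings could be collected on a Linux box. The describe_overcommit helper is a hypothetical name introduced here; the mode meanings (0 = heuristic, 1 = always overcommit, 2 = strict accounting) follow the Linux kernel's documented behaviour, and the /proc paths are assumed to be readable.

```shell
#!/bin/sh
# Sketch only: report the kernel's memory overcommit configuration.
# describe_overcommit is a hypothetical helper for readability.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic" ;;   # kernel refuses only obvious overcommits
        1) echo "always" ;;      # never refuse an allocation
        2) echo "strict" ;;      # limit = swap + overcommit_ratio% of RAM
        *) echo "unknown" ;;
    esac
}

# Print the raw values if the files exist (Linux only).
for name in overcommit_memory overcommit_ratio; do
    path="/proc/sys/vm/$name"
    if [ -r "$path" ]; then
        echo "$name = $(cat "$path")"
    fi
done

mode=$(cat /proc/sys/vm/overcommit_memory 2>/dev/null)
echo "overcommit mode: $(describe_overcommit "$mode")"

# Commit and swap figures give context for whether the limit is being hit.
grep -E '^(CommitLimit|Committed_AS|SwapTotal|SwapFree):' /proc/meminfo 2>/dev/null || true
```

Running this while mongod is failing to start, rather than afterwards, would give the most useful numbers.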
| Comment by James Simpson [ 05/Jul/11 ] |
|
What makes you say that? 7GB of free RAM isn't enough for MongoDB? |
| Comment by Eliot Horowitz (Inactive) [ 05/Jul/11 ] |
|
It looks like there isn't enough ram for all the things running on the box. |
| Comment by James Simpson [ 05/Jul/11 ] |
|
Here are the results from top (nothing showed in mongostat because it could never connect, since mongod crashed after the above error happened):
top - 08:20:14 up 1 day, 18:57, 3 users, load average: 0.25, 0.10, 0.06
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ SWAP COMMAND |
| Comment by Eliot Horowitz (Inactive) [ 05/Jul/11 ] |
|
Yes - trying to see if something else is using ram, if there is a leak, etc... |
| Comment by James Simpson [ 03/Jul/11 ] |
|
In the original details I stated that the box has 8GB of RAM (the database itself is less than 500MB). It also has 480GB of free disk space. Is there anything else you are looking for? |
| Comment by Eliot Horowitz (Inactive) [ 03/Jul/11 ] |
|
Can you send the stats for the box? |
| Comment by James Simpson [ 02/Jul/11 ] |
|
Just to add some more information: I checked the logs this morning (having run without --master all night), and they are full of a similar error, which has never shown up in the logs before. It is being written to the logs every 60 seconds.
Sat Jul 2 10:04:03 [dur] lsn set 228767 |
| Comment by James Simpson [ 02/Jul/11 ] |
|
Yes, sorry. What I was trying to say was that I had been running it for the past week without --master, and this is what happens when I try to start it with --master. |
| Comment by Eliot Horowitz (Inactive) [ 02/Jul/11 ] |
|
Had you been running without --master before? |