[SERVER-1603] mongod crashes with out of memory error during restore Created: 18/Aug/10  Updated: 16/Jan/11  Resolved: 16/Jan/11

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 1.6.0, 1.6.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Roger Bodamer (Inactive) Assignee: Roger Bodamer (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux mongodb 2.6.34.2-dotcloud-ec2 #1 SMP Tue Aug 3 12:04:10 PDT 2010 x86_64 GNU/Linux


Attachments: Text File mongorestore.txt     Text File mongostat.txt     Text File term.txt    
Operating System: Linux
Participants:

 Description   

Scenario:

$ /etc/init.d/mongodb stop
$ rm -f /var/lib/mongodb/*
$ /etc/init.d/mongodb start
$ mongorestore dump/

During restore, mongod is killed with an out of memory error. The mongostat output is attached to the bug as well as the mongorestore output.

Roger has account information for a machine where the bug can be reproduced.



 Comments   
Comment by Roger Bodamer (Inactive) [ 19/Aug/10 ]

from Aaron:

Possibly it's related to "external sort", which is an offline sort of keys to index. I'm just guessing because I'm seeing a message about external sort shortly before the out-of-memory issue, and there are several lines in the external sort code where you can print memory info in certain debug modes (I didn't write the code). I'm doing another run with more logging.

I enabled the extra logging, but unfortunately it appeared in the terminal window rather than the log file. I'm attaching the terminal output. The corresponding mongod log is in /var/log/mongodb/mongodb.log; the last run started before noon.

I don't know whether the external sort is relevant to running out of memory, but the logging enabled for it does indicate that overall memory usage increases as mongod continues running.
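
For anyone wanting to track memory growth independently of the external sort logging, a minimal sketch follows. It assumes a Linux /proc filesystem; the function names rss_kb_from_status and watch_rss are hypothetical helpers, not part of MongoDB.

```shell
#!/bin/sh
# Hypothetical helper (not part of MongoDB): print the VmRSS value in kB
# from a /proc/<pid>/status-style file passed as the first argument.
rss_kb_from_status() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Print a timestamped resident set size for a process at a fixed interval
# until the process exits (its /proc entry disappears).
# Arguments: pid, interval in seconds (default 5).
watch_rss() {
    pid=$1
    interval=${2:-5}
    while [ -r "/proc/$pid/status" ]; do
        printf '%s %s kB\n' "$(date +%H:%M:%S)" \
            "$(rss_kb_from_status "/proc/$pid/status")"
        sleep "$interval"
    done
}

# Example (assumes pidof is installed and exactly one mongod is running):
# watch_rss "$(pidof mongod)" 5
```

Running watch_rss in a second terminal during mongorestore should show whether resident memory climbs steadily up to the point where the process is killed.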

Also, I modified /etc/mongodb.conf to put files in /var/lib/mongodb2 which was an empty dbpath I could use for testing. I think /var/lib/mongodb has the original data files.

I was unable to reproduce the issue running with the data files already present in /var/lib/mongodb. I could reproduce it using an empty /var/lib/mongodb2.

Comment by Roger Bodamer (Inactive) [ 18/Aug/10 ]

Customer Comments:

We are experiencing OOM crashes with mongodb 1.6.0. We can reproduce the crash
by repairing the database.

We have tried various things:

  • adding a large quantity of swap (> 20 GB);
  • setting /proc/sys/vm/overcommit_memory to 1 or 2 instead of the default (0);
  • disabling the OOM killer for the mongodb process.
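
The settings listed above can be inspected with a sketch along these lines. It assumes a Linux /proc filesystem; describe_overcommit and show_oom_settings are hypothetical helper names, and the per-process file that exists depends on the kernel version.

```shell
#!/bin/sh
# Hypothetical helper: translate a vm.overcommit_memory value into a short
# description of the policy it selects.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting (no overcommit)" ;;
        *) echo "unknown" ;;
    esac
}

# Print the system-wide overcommit policy and, if a pid is given, the
# per-process OOM adjustment. Older kernels expose oom_adj (where -17
# disables the OOM killer for that process); later kernels also expose
# oom_score_adj. Whichever files are readable are printed.
show_oom_settings() {
    mode=$(cat /proc/sys/vm/overcommit_memory)
    echo "overcommit_memory=$mode: $(describe_overcommit "$mode")"
    pid=$1
    [ -n "$pid" ] || return 0
    for f in oom_score_adj oom_adj; do
        if [ -r "/proc/$pid/$f" ]; then
            echo "$f=$(cat "/proc/$pid/$f")"
        fi
    done
}

# Example (assumes pidof is installed and exactly one mongod is running):
# show_oom_settings "$(pidof mongod)"
```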

MongoDB also crashed from memory exhaustion when we tried:

  • to use master/slave replication (the slave crashed);
  • to mongorestore from both a datadir and a dump.

Nothing we tried, except using a machine with 30GB of RAM, prevented mongodb from
being OOM-killed by the Linux kernel.
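
To confirm that a given crash was caused by the kernel's OOM killer rather than by mongod itself, the kernel log can be filtered with a sketch like the following. The helper name filter_oom is hypothetical, and the patterns are assumptions based on common kernel message wording, so they may need adjusting for a particular kernel.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and keep only those that
# look like OOM-killer activity (case-insensitive match).
filter_oom() {
    grep -i -E 'out of memory|oom-killer'
}

# Example: scan the kernel ring buffer (may require root on some systems).
# dmesg | filter_oom
```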

The crash occurs on the biggest collection in our database. This collection is
about 8 GB in size.

Please find attached the mongodb log, the mongostat output and the log of my
mongo shell session when reproducing the problem.

The machine I use to reproduce the problem is an m1.large instance on Amazon
EC2. This machine has a dual-core CPU with 7.5GB of RAM and I have added 20GB
of swap.

The machine runs Linux 2.6.34 and Ubuntu 10.04; we installed mongodb-stable from
the 10gen repositories.

I can provide ssh root access to this machine where you will be able to debug
at will.
