[SERVER-3152] Segmentation fault after too many open files
Created: 26/May/11  Updated: 23/Feb/17  Resolved: 23/Feb/17

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 1.6.5
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Pieter Ennes
Assignee: Unassigned
Resolution: Cannot Reproduce
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 10.04 LTS on Amazon EC2 large nodes on mdadm/lvm EBS disks


Operating System: Linux
Participants:

 Description   

The primary node of a replica set ran out of file descriptors. The limit reported by ulimit is 1024, and the error was logged more than 6.5 million times:

root@m2:~# ulimit -n
1024

root@m2:~# grep "Too many open files" /var/log/mongodb/mongodb.log | wc -l
6574844

In the end this resulted in a segfault, shown at the tail of the log:

Thu May 26 20:27:35 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
[6M times]

Thu May 26 20:27:35 [conn1126823] Uncaught std::exception: boost::filesystem::basic_directory_iterator constructor: Too many open files: "/mnt/mongo/_tmp/esort.1306441653.672251719/", terminating
Thu May 26 20:27:35 dbexit:

Thu May 26 20:27:35 [conn1126823] shutdown: going to close listening sockets...
Thu May 26 20:27:35 [conn1126823] closing listening socket: 18
Thu May 26 20:27:35 [conn1126823] closing listening socket: 20
Thu May 26 20:27:35 [conn1126823] shutdown: going to flush oplog...
Thu May 26 20:27:35 [conn1126823] shutdown: going to close sockets...
Thu May 26 20:27:35 [conn1126823] shutdown: waiting for fs preallocator...
Thu May 26 20:27:35 [conn1126823] shutdown: closing all files...
Thu May 26 20:27:35 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files
Thu May 26 20:27:35 [conn8] end connection 127.0.0.1:55336
Thu May 26 20:27:35 [conn1127914] assertion 11600 interrupted at shutdown ns:synth.current query:{ query: {}, $snapshot: true }
Thu May 26 20:27:35 [conn1127914] query synth.current exception 1214ms
Thu May 26 20:27:35 [conn1127914] SocketException in connThread, closing client connection
Thu May 26 20:27:35 [conn7] end connection 127.0.0.1:55335
Thu May 26 20:27:35 ERROR: Client::shutdown not called: slaveTracking
Thu May 26 20:27:35 Got signal: 11 (Segmentation fault).

Thu May 26 20:27:35 [conn52293] end connection 10.254.238.86:54967
Thu May 26 20:27:35 Backtrace:
0x824629 0x7f3429c7eaf0 0x6e1e90 0x6e1efa 0x7247af 0x6c19d7 0x6bcdb2 0x6206cd 0x622b6c 0x705ba4 0x70adf2 0x70b494 0x5588a2 0x7a287c 0x797596 0x798538 0x5fb7e5 0x60029f 0x7074ba 0x70aaf6
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x824629]
/lib/libc.so.6(+0x33af0) [0x7f3429c7eaf0]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails6_allocEPKci+0) [0x6e1e90]
/usr/bin/mongod(_ZN5mongo16NamespaceDetails5allocEPKciRNS_7DiskLocE+0x3a) [0x6e1efa]
/usr/bin/mongod(_ZN5mongo11DataFileMgr17fast_oplog_insertEPNS_16NamespaceDetailsEPKci+0x17f) [0x7247af]
/usr/bin/mongod() [0x6c19d7]
/usr/bin/mongod(_ZN5mongo5logOpEPKcS1_RKNS_7BSONObjEPS2_Pb+0x42) [0x6bcdb2]
/usr/bin/mongod(_ZN5mongo14_updateObjectsEbPKcRKNS_7BSONObjES2_bbbRNS_7OpDebugEPNS_11RemoveSaverE+0x1dfd) [0x6206cd]
/usr/bin/mongod(_ZN5mongo13updateObjectsEPKcRKNS_7BSONObjES2_bbbRNS_7OpDebugE+0x11c) [0x622b6c]
/usr/bin/mongod(_ZN5mongo14receivedUpdateERNS_7MessageERNS_5CurOpE+0x4d4) [0x705ba4]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x17d2) [0x70adf2]
/usr/bin/mongod(_ZN5mongo14DBDirectClient3sayERNS_7MessageE+0x64) [0x70b494]
/usr/bin/mongod(_ZN5mongo12DBClientBase6updateERKSsNS_5QueryENS_7BSONObjEbb+0x2a2) [0x5588a2]
/usr/bin/mongod(_ZN5mongo16CmdFindAndModify3runERKSsRNS_7BSONObjERSsRNS_14BSONObjBuilderEb+0x140c) [0x7a287c]
/usr/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xa16) [0x797596]
/usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x798) [0x798538]
/usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_10BufBuilderERNS_14BSONObjBuilderEbi+0x35) [0x5fb7e5]
/usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x1bbf) [0x60029f]
/usr/bin/mongod() [0x7074ba]

/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_8SockAddrE+0x14d6) [0x70aaf6]

Thu May 26 20:27:35 dbexit: ; exiting immediately

Thu May 26 20:27:35 ERROR: Client::~Client _context should be null but is not; client:conn
Thu May 26 20:27:35 [conn31] end connection 127.0.0.1:55337
Thu May 26 20:27:36 [conn50043] end connection 10.86.197.56:48503
Thu May 26 20:27:36 [conn430] end connection 10.212.71.69:40076
Thu May 26 20:27:37 [conn52518] end connection 10.198.107.95:58198
50/245 20%
88/245 35%
Thu May 26 20:27:39 Got signal: 11 (Segmentation fault).

Thu May 26 20:27:39 Backtrace:
0x824629 0x7f3429c7eaf0 0x52c2b5 0x701d57 0x702551 0x827a08 0x83a4b0 0x7f342a7829ca 0x7f3429d3170d
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x824629]
/lib/libc.so.6(+0x33af0) [0x7f3429c7eaf0]
/usr/bin/mongod(_ZN5mongo9MongoFile13closeAllFilesERSt18basic_stringstreamIcSt11char_traitsIcESaIcEE+0xa5) [0x52c2b5]
/usr/bin/mongod(_ZN5mongo8shutdownEv+0x3a7) [0x701d57]
/usr/bin/mongod(_ZN5mongo6dbexitENS_8ExitCodeEPKc+0x201) [0x702551]
/usr/bin/mongod(_ZN5mongo10connThreadEPNS_13MessagingPortE+0x13f8) [0x827a08]
/usr/bin/mongod(thread_proxy+0x80) [0x83a4b0]
/lib/libpthread.so.0(+0x69ca) [0x7f342a7829ca]
/lib/libc.so.6(clone+0x6d) [0x7f3429d3170d]

The node had stopped responding, at least from the time of the first message:

Thu May 26 20:25:06 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files

to the last:

Thu May 26 20:27:35 [initandlisten] Listener: accept() returns -1 errno:24 Too many open files

but very likely had already been unresponsive for about 10 minutes before that. Nevertheless, the replica set failed over only when the segfault occurred.

Do let me know if you need more information...
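
When triaging a report like this one, it can help to compare the descriptor count of the running process against its own limit. `ulimit -n` in a root shell shows that shell's limit, which is not necessarily the limit the daemon runs with. The following is a minimal sketch, assuming a Linux host with /proc mounted; the helper names `parse_max_open_files` and `fd_usage` are made up for illustration and are not part of MongoDB.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A value of None means the limit is reported as "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process.

    Linux only. Needs read access to /proc/<pid>, so inspecting another
    user's process usually requires root.
    """
    with open(f"/proc/{pid}/limits") as f:
        soft, _hard = parse_max_open_files(f.read())
    # When a process inspects itself, the directory listing briefly holds
    # one extra descriptor, so the count can be one too high.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft


if __name__ == "__main__":
    # Hypothetical usage: substitute the PID of the mongod process.
    used, limit = fd_usage(os.getpid())
    print(f"{used} descriptors open, soft limit {limit}")
```

Polling these two numbers and alerting when the count approaches the limit would flag the condition before accept() starts failing with errno 24.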



 Comments   
Comment by Shahaf Abileah [ 14/Aug/14 ]

I think I hit the same issue.

I'm running MongoDB 2.4.8 on a Mac, installed using Homebrew.

/usr/local/etc> mongo --version
MongoDB shell version: 2.4.8
/usr/local/etc> mongod --version
db version v2.4.8
Thu Aug 14 12:20:43.739 git version: nogitversion

My mongod process shut down with Exit 14. I looked at my /usr/local/var/log/mongodb/mongo.log file and found the following:

Thu Aug 14 12:00:38.022 [journal] exception in dur::groupCommitLL causing immediate shutdown: boost::filesystem::current_path: Too many open files in system
Thu Aug 14 12:00:38.022 dur4
Thu Aug 14 12:00:38.022 Got signal: 6 (Abort trap: 6).

Thu Aug 14 12:00:38.067 Backtrace:
0x1088c7990 0x1083d822d 0x7fff8e6765aa 0 0x7fff93d27b1a 0x10860a353 0x1085829d5 0x108582dae 0x1088ff8c1 0x7fff8f573899 0x7fff8f57372a 0x7fff8f577fc9
0 mongod 0x00000001088c7990 _ZN5mongo15printStackTraceERSo + 64
1 mongod 0x00000001083d822d _ZN5mongo10abruptQuitEi + 397
2 libsystem_platform.dylib 0x00007fff8e6765aa _sigtramp + 26
3 ??? 0x0000000000000000 0x0 + 0
4 libsystem_c.dylib 0x00007fff93d27b1a abort + 125
5 mongod 0x000000010860a353 _ZN5mongo10mongoAbortEPKc + 99
6 mongod 0x00000001085829d5 _ZN5mongo3dur27groupCommitWithLimitedLocksEv + 1429
7 mongod 0x0000000108582dae _ZN5mongo3dur9durThreadEv + 622
8 mongod 0x00000001088ff8c1 thread_proxy + 177
9 libsystem_pthread.dylib 0x00007fff8f573899 _pthread_body + 138
10 libsystem_pthread.dylib 0x00007fff8f57372a _pthread_struct_init + 0
11 libsystem_pthread.dylib 0x00007fff8f577fc9 thread_start + 13
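
The two logs in this ticket carry different messages. The original report shows "errno:24 Too many open files", which is the text for EMFILE (the per-process descriptor limit). The comment shows "Too many open files in system", which is the usual text for ENFILE (the system-wide file table limit). This suggests the two failures may have hit different limits, although neither log confirms that. A small sketch of the distinction follows; the function `classify_fd_error` is a hypothetical helper, not a MongoDB API.

```python
import errno
import os


def classify_fd_error(err):
    """Say which descriptor limit an errno value points to."""
    if err == errno.EMFILE:
        return "per-process limit (RLIMIT_NOFILE)"
    if err == errno.ENFILE:
        return "system-wide file table limit"
    return "not a descriptor-exhaustion error"


if __name__ == "__main__":
    for code in (errno.EMFILE, errno.ENFILE):
        # os.strerror returns the platform's message text for the code
        print(code, errno.errorcode[code], os.strerror(code), "->", classify_fd_error(code))
```

Raising `ulimit -n` only addresses the per-process case. The system-wide case is governed by kernel settings instead.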
