[SERVER-32423] mongod crash Created: 19/Dec/17  Updated: 27/Mar/22  Resolved: 20/Dec/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jackson yu Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux version 3.10.0-327.36.1.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) )


Issue Links:
Duplicate
is duplicated by SERVER-43791 Secondary Replica Crushed with error ... Closed
Participants:

 Description   

2017-12-19T16:28:34.803+0800 E - [conn5] cannot open /dev/urandom Too many open files
2017-12-19T16:28:34.803+0800 I - [conn5] Fatal Assertion 28839
2017-12-19T16:28:34.803+0800 I - [conn5]

***aborting after fassert() failure

2017-12-19T16:28:34.867+0800 F - [conn5] Got signal: 6 (Aborted).



 Comments   
Comment by Fubang Li [ 27/Mar/22 ]

This is a production environment, and its QPS can reach several tens of thousands. Please help me find a safe solution to avoid this type of crash. Thanks very much. Until this issue is fixed, I need to watch it every second. I can't sleep now.

Comment by Fubang Li [ 27/Mar/22 ]

@Agarunov 

Comment by Fubang Li [ 27/Mar/22 ]

2 of the 3 nodes in a 3-node mongo replica set crashed; fortunately, they were not down at the same time.

The mongod version is: 3.2.22

Comment by Fubang Li [ 27/Mar/22 ]

I hit this issue yesterday.

```

root@mongomaster:~# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 499385
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 499385
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

```

The error message:

```

2022-03-26T14:05:43.870+0800 I NETWORK [initandlisten] connection accepted from 172.17.89.165:37068 #1686 (434 connections now open)
2022-03-26T14:05:43.872+0800 E - [conn1686] cannot open /dev/urandom Too many open files
2022-03-26T14:05:43.872+0800 I - [conn1686] Fatal Assertion 28839
2022-03-26T14:05:43.872+0800 I - [conn1686]

***aborting after fassert() failure

2022-03-26T14:05:43.881+0800 F - [conn1686] Got signal: 6 (Aborted).

0x155c5e2 0x155b589 0x155bdf2 0x7feacb7f2390 0x7feacb44c438 0x7feacb44e03a 0x14d85f3 0x12ef2c2 0xb0b7df 0xb0c3dd 0xae1eb7 0xaff9b4 0xb01a7e 0xc7af56 0xc7c24b 0xb8727b 0xdb555a 0xdb8a66 0x9c7730 0x1502451 0x7feacb7e86ba 0x7feacb51e51d
----- BEGIN BACKTRACE -----

{"backtrace":[{"b":"400000","o":"115C5E2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"115B589"},{"b":"400000","o":"115BDF2"},{"b":"7FEACB7E1000","o":"11390"},{"b":"7FEACB417000","o":"35438","s":"gsignal"},{"b":"7FEACB417000","o":"3703A","s":"abort"},{"b":"400000","o":"10D85F3","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"EEF2C2","s":"_ZN5mongo12SecureRandom6createEv"},{"b":"400000","o":"70B7DF","s":"_ZN5mongo31SaslSCRAMSHA1ServerConversation10_firstStepERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EEPS7_"},{"b":"400000","o":"70C3DD","s":"_ZN5mongo31SaslSCRAMSHA1ServerConversation4stepENS_10StringDataEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"},{"b":"400000","o":"6E1EB7","s":"_ZN5mongo31NativeSaslAuthenticationSession4stepENS_10StringDataEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"},{"b":"400000","o":"6FF9B4"},{"b":"400000","o":"701A7E"},{"b":"400000","o":"87AF56","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},{"b":"400000","o":"87C24B","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},{"b":"400000","o":"78727B","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},{"b":"400000","o":"9B555A"},{"b":"400000","o":"9B8A66","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},{"b":"400000","o":"5C7730"},{"b":"400000","o":"1102451","s":"_ZN5mongo17PortMessageServer17handleIncomingMsgEPv"},{"b":"7FEACB7E1000","o":"76BA"},{"b":"7FEACB417000","o":"10751D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.22", "gitVersion" : "105acca0d443f9a47c1a5bd608fd7133840a58dd", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.0-210-generic", "version" : "#242-Ubuntu SMP Fri Apr 16 09:57:56 UTC 2021", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "C2070FF92CF0E7C7AF25D84027F691037262CEA2" }, { "b" : "7FFE173FA000", "elfType" : 3, "buildId" : "F9AD0E333B550914FB245B1251D85AFA20523DC0" }, { "b" : "7FEACC76E000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "AC6EB239181BE92EF90A74D344006276841F1102" }, { "b" : "7FEACC329000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "819406AC9B59B46936A823F9F96A3E55E5930EE8" }, { "b" : "7FEACC121000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "BB404D52807964CCC7F0815BC2666688A74B958F" }, { "b" : "7FEACBF1D000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "BBA6A2E958188C44B9BDA990278EBE8868B85379" }, { "b" : "7FEACBC14000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "E178A25E6DB28598588C03D898E44FD79BD16E4D" }, { "b" : "7FEACB9FE000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "68220AE2C65D65C1B6AAA12FA6765A6EC2F5F434" }, { "b" : "7FEACB7E1000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "C557B8146E8079AF46310B549DE6912D1FC4EA86" }, { "b" : "7FEACB417000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30773BE8CF5BFED9D910C8473DD44EAAB2E705AB" }, { "b" : "7FEACC9D6000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "98D7BC4313D0D8D5E127E06ACF2319829C5CE61D" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x155c5e2]
mongod(+0x115B589) [0x155b589]
mongod(+0x115BDF2) [0x155bdf2]
libpthread.so.0(+0x11390) [0x7feacb7f2390]
libc.so.6(gsignal+0x38) [0x7feacb44c438]
libc.so.6(abort+0x16A) [0x7feacb44e03a]
mongod(_ZN5mongo13fassertFailedEi+0x93) [0x14d85f3]
mongod(_ZN5mongo12SecureRandom6createEv+0x2E2) [0x12ef2c2]
mongod(_ZN5mongo31SaslSCRAMSHA1ServerConversation10_firstStepERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EEPS7_+0x1E7F) [0xb0b7df]
mongod(_ZN5mongo31SaslSCRAMSHA1ServerConversation4stepENS_10StringDataEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x37D) [0xb0c3dd]
mongod(_ZN5mongo31NativeSaslAuthenticationSession4stepENS_10StringDataEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x27) [0xae1eb7]
mongod(+0x6FF9B4) [0xaff9b4]
mongod(+0x701A7E) [0xb01a7e]
mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x676) [0xc7af56]
mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0x85B) [0xc7c24b]
mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x25B) [0xb8727b]
mongod(+0x9B555A) [0xdb555a]
mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x7D6) [0xdb8a66]
mongod(+0x5C7730) [0x9c7730]
mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x311) [0x1502451]
libpthread.so.0(+0x76BA) [0x7feacb7e86ba]
libc.so.6(clone+0x6D) [0x7feacb51e51d]
----- END BACKTRACE -----

```

Comment by Mark Agarunov [ 20/Dec/17 ]

Hello 672042564@qq.com,

Thank you for the report. Looking at the output provided, it appears that you've hit the open file limit on your system:

2017-12-19T16:28:34.803+0800 E -        [conn5] cannot open /dev/urandom Too many open files

The limit can be raised to allow a larger number of open files, which avoids hitting this error.
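
Before raising the limit, it can help to confirm what the running process is actually allowed and how many descriptors it holds, since the values printed by `ulimit -a` in a shell describe that shell and not necessarily the mongod process. The following is a minimal sketch, assuming a Linux host with `/proc` mounted and enough privileges to read the target process's entries; the function names `parse_max_open_files` and `fd_usage` are illustrative and not part of MongoDB.

```python
import os


def parse_max_open_files(limits_text):
    """Extract (soft, hard) 'Max open files' from the text of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for a Linux process."""
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    return open_fds, soft, hard


if __name__ == "__main__":
    # "self" inspects this script's own process; pass the mongod PID instead.
    print(fd_usage("self"))
```

If the open descriptor count sits close to the soft limit while the server is under load, that would be consistent with the `Too many open files` error above.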

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group.

Thanks,
Mark

Comment by Jackson yu [ 19/Dec/17 ]

Has anyone else encountered this crash? Thank you.

Generated at Thu Feb 08 04:30:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.