[SERVER-25140] mongos crashes with fassert Created: 19/Jul/16  Updated: 13/Feb/18  Resolved: 21/Feb/17

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: 3.2.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Edik Mkoyan Assignee: Kelsey Schubert
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

2016-07-19T17:40:02.263+0400 I ACCESS   [conn20020] Successfully authenticated as principal dbadmin on yerevan
2016-07-19T17:40:02.521+0400 E -        [NetworkInterfaceASIO-TaskExecutorPool-2-0] cannot open /dev/urandom Too many open files
2016-07-19T17:40:02.521+0400 I -        [NetworkInterfaceASIO-TaskExecutorPool-2-0] Fatal Assertion 28839
2016-07-19T17:40:02.521+0400 I -        [NetworkInterfaceASIO-TaskExecutorPool-2-0] 
 
***aborting after fassert() failure
 
 
2016-07-19T17:40:02.558+0400 F -        [NetworkInterfaceASIO-TaskExecutorPool-2-0] Got signal: 6 (Aborted).
 
 0xc635e2 0xc62739 0xc62f42 0x7fca307df330 0x7fca30440c37 0x7fca30444028 0xbeb732 0xa40c73 0x7710ed 0x771ce2 0x74397b 0x768115 0x76c77b 0x6ff48c 0x701c4d 0x701e27 0xa1495c 0xa16089 0xa16cd0 0x9ec060 0x9fa50c 0x9fa9c8 0xc7f271 0xc7f491 0xc8362f 0xa0ed65 0xe961d0 0x7fca307d7184 0x7fca3050437d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"8635E2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"862739"},{"b":"400000","o":"862F42"},{"b":"7FCA307CF000","o":"10330"},{"b":"7FCA3040A000","o":"36C37","s":"gsignal"},{"b":"7FCA3040A000","o":"3A028","s":"abort"},{"b":"400000","o":"7EB732","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"640C73","s":"_ZN5mongo12SecureRandom6createEv"},{"b":"400000","o":"3710ED","s":"_ZN5mongo31SaslSCRAMSHA1ClientConversation10_firstStepEPSs"},{"b":"400000","o":"371CE2","s":"_ZN5mongo31SaslSCRAMSHA1ClientConversation4stepENS_10StringDataEPSs"},{"b":"400000","o":"34397B","s":"_ZN5mongo23NativeSaslClientSession4stepENS_10StringDataEPSs"},{"b":"400000","o":"368115"},{"b":"400000","o":"36C77B"},{"b":"400000","o":"2FF48C"},{"b":"400000","o":"301C4D"},{"b":"400000","o":"301E27","s":"_ZN5mongo4auth18authenticateClientERKNS_7BSONObjENS_10StringDataES4_St8functionIFvNS_8executor20RemoteCommandRequestES5_IFvNS_10StatusWithINS6_21RemoteCommandResponseEEEEEEESC_"},{"b":"400000","o":"61495C","s":"_ZN5mongo8executor20NetworkInterfaceASIO13_authenticateEPNS1_7AsyncOpE"},{"b":"400000","o":"616089"},{"b":"400000","o":"616CD0"},{"b":"400000","o":"5EC060","s":"_ZN4asio6detail14strand_service8dispatchINS0_7binder2IRSt8functionIFvSt10error_codemEES5_mEEEEvRPNS1_11strand_implERT_"},{"b":"400000","o":"5FA50C","s":"_ZN4asio6detail14strand_service8dispatchINS0_17rewrapped_handlerINS0_7binder2INS0_7read_opINS_19basic_stream_socketINS_2ip3tcpENS_21stream_socket_serviceIS8_EEEENS_17mutable_buffers_1ENS0_14transfer_all_tENS0_15wrapped_handlerINS_10io_service6strandESt8functionIFvSt10error_codemEENS0_26is_continuation_if_runningEEEEESI_mEESK_EEEEvRPNS1_11strand_implERT_"},{"b":"400000","o":"5FA9C8","s":"_ZN4asio6detail23reactive_socket_recv_opINS_17mutable_buffers_1ENS0_7read_opINS_19basic_stream_socketINS_2ip3tcpENS_21stream_socket_serviceIS6_EEEES2_NS0_14transfer_all_tENS0_15wrapped_handlerINS_10io_service6strandESt8functionIFvSt10error_codemEENS0_26is_continua
tion_if_runningEEEEEE11do_completeEPvPNS0_19scheduler_operationERKSF_m"},{"b":"400000","o":"87F271","s":"_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code"},{"b":"400000","o":"87F491","s":"_ZN4asio6detail9scheduler3runERSt10error_code"},{"b":"400000","o":"88362F","s":"_ZN4asio10io_service3runEv"},{"b":"400000","o":"60ED65"},{"b":"400000","o":"A961D0","s":"execute_native_thread_routine"},{"b":"7FCA307CF000","o":"8184"},{"b":"7FCA3040A000","o":"FA37D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.8", "gitVersion" : "ed70e33130c977bda0024c125b56d159573dbaf0", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.19.0-64-generic", "version" : "#72~14.04.1-Ubuntu SMP Fri Jun 24 17:59:48 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "9D770B2837066D506516D6E0519BF9092FA818FE" }, { "b" : "7FFCFC7E0000", "elfType" : 3, "buildId" : "C89BD46B7CFC47F3E55EF539B3FAF8E450562F6A" }, { "b" : "7FCA316F1000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "74864DB9D5F69D39A67E4755012FB6573C469B3D" }, { "b" : "7FCA31315000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "AAE7CFF8351B730830BDBCE0DCABBE06574B7144" }, { "b" : "7FCA3110D000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "E2A6DD5048A0A051FD61043BDB69D8CC68192AB7" }, { "b" : "7FCA30F09000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "DA9B8C234D0FE9FD8CAAC8970A7EC1B6C8F6623F" }, { "b" : "7FCA30C03000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "D144258E614900B255A31F3FD2283A878670D5BC" }, { "b" : "7FCA309ED000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "36311B4457710AE5578C4BF00791DED7359DBB92" }, { "b" : "7FCA307CF000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : 
"31E9F21AE8C10396171F1E13DA15780986FA696C" }, { "b" : "7FCA3040A000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "CF699A15CAAE64F50311FC4655B86DC39A479789" }, { "b" : "7FCA31950000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "D0F537904076D73F29E4A37341F8A449E2EF6CD0" } ] }}
 mongos(_ZN5mongo15printStackTraceERSo+0x32) [0xc635e2]
 mongos(+0x862739) [0xc62739]
 mongos(+0x862F42) [0xc62f42]
 libpthread.so.0(+0x10330) [0x7fca307df330]
 libc.so.6(gsignal+0x37) [0x7fca30440c37]
 libc.so.6(abort+0x148) [0x7fca30444028]
 mongos(_ZN5mongo13fassertFailedEi+0x82) [0xbeb732]
 mongos(_ZN5mongo12SecureRandom6createEv+0x223) [0xa40c73]
 mongos(_ZN5mongo31SaslSCRAMSHA1ClientConversation10_firstStepEPSs+0x17D) [0x7710ed]
 mongos(_ZN5mongo31SaslSCRAMSHA1ClientConversation4stepENS_10StringDataEPSs+0x282) [0x771ce2]
 mongos(_ZN5mongo23NativeSaslClientSession4stepENS_10StringDataEPSs+0x2B) [0x74397b]
 mongos(+0x368115) [0x768115]
 mongos(+0x36C77B) [0x76c77b]
 mongos(+0x2FF48C) [0x6ff48c]
 mongos(+0x301C4D) [0x701c4d]
 mongos(_ZN5mongo4auth18authenticateClientERKNS_7BSONObjENS_10StringDataES4_St8functionIFvNS_8executor20RemoteCommandRequestES5_IFvNS_10StatusWithINS6_21RemoteCommandResponseEEEEEEESC_+0xD7) [0x701e27]
 mongos(_ZN5mongo8executor20NetworkInterfaceASIO13_authenticateEPNS1_7AsyncOpE+0x10C) [0xa1495c]
 mongos(+0x616089) [0xa16089]
 mongos(+0x616CD0) [0xa16cd0]
 mongos(_ZN4asio6detail14strand_service8dispatchINS0_7binder2IRSt8functionIFvSt10error_codemEES5_mEEEEvRPNS1_11strand_implERT_+0x70) [0x9ec060]
 mongos(_ZN4asio6detail14strand_service8dispatchINS0_17rewrapped_handlerINS0_7binder2INS0_7read_opINS_19basic_stream_socketINS_2ip3tcpENS_21stream_socket_serviceIS8_EEEENS_17mutable_buffers_1ENS0_14transfer_all_tENS0_15wrapped_handlerINS_10io_service6strandESt8functionIFvSt10error_codemEENS0_26is_continuation_if_runningEEEEESI_mEESK_EEEEvRPNS1_11strand_implERT_+0x89C) [0x9fa50c]
 mongos(_ZN4asio6detail23reactive_socket_recv_opINS_17mutable_buffers_1ENS0_7read_opINS_19basic_stream_socketINS_2ip3tcpENS_21stream_socket_serviceIS6_EEEES2_NS0_14transfer_all_tENS0_15wrapped_handlerINS_10io_service6strandESt8functionIFvSt10error_codemEENS0_26is_continuation_if_runningEEEEEE11do_completeEPvPNS0_19scheduler_operationERKSF_m+0x228) [0x9fa9c8]
 mongos(_ZN4asio6detail9scheduler10do_run_oneERNS0_11scoped_lockINS0_11posix_mutexEEERNS0_21scheduler_thread_infoERKSt10error_code+0x2F1) [0xc7f271]
 mongos(_ZN4asio6detail9scheduler3runERSt10error_code+0xC1) [0xc7f491]
 mongos(_ZN4asio10io_service3runEv+0x2F) [0xc8362f]
 mongos(+0x60ED65) [0xa0ed65]
 mongos(execute_native_thread_routine+0x20) [0xe961d0]
 libpthread.so.0(+0x8184) [0x7fca307d7184]
 libc.so.6(clone+0x6D) [0x7fca3050437d]
-----  END BACKTRACE  -----



 Comments   
Comment by German Gutierrez [ 13/Feb/18 ]

Thanks Kelsey

I increased my ulimits to the limits recommended for MongoDB and have had no more crashes in my DB. Thanks for supporting this topic.

-f (file size): unlimited
-t (cpu time): unlimited
-v (virtual memory): unlimited
-n (open files): 64000
-m (memory size): unlimited
-u (processes/threads): 64000
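
To compare a process's limits with the values listed above, something like the following Python sketch could be used. It is only an illustration: the RECOMMENDED table and the check_limits helper are hypothetical names, and the script reports the limits of the process running it, so it would need to be started the same way and under the same user as mongod or mongos to be meaningful.

```python
import resource

# Values copied from the list above; None stands for "unlimited".
# RECOMMENDED and check_limits are illustrative names, not MongoDB tooling.
RECOMMENDED = {
    "RLIMIT_FSIZE": None,    # -f (file size)
    "RLIMIT_CPU": None,      # -t (cpu time)
    "RLIMIT_AS": None,       # -v (virtual memory)
    "RLIMIT_NOFILE": 64000,  # -n (open files)
    "RLIMIT_RSS": None,      # -m (memory size)
    "RLIMIT_NPROC": 64000,   # -u (processes/threads)
}


def check_limits(recommended=RECOMMENDED):
    """Return {name: (soft_limit, ok)} for the current process."""
    report = {}
    for name, wanted in recommended.items():
        res = getattr(resource, name, None)
        if res is None:
            # Not every platform defines every limit; skip missing ones.
            continue
        soft, _hard = resource.getrlimit(res)
        if soft == resource.RLIM_INFINITY:
            ok = True            # unlimited satisfies any recommendation
        elif wanted is None:
            ok = False           # recommendation is unlimited, limit is finite
        else:
            ok = soft >= wanted
        report[name] = (soft, ok)
    return report


if __name__ == "__main__":
    for name, (soft, ok) in check_limits().items():
        shown = "unlimited" if soft == resource.RLIM_INFINITY else soft
        status = "ok" if ok else "below recommendation"
        print(f"{name}: soft={shown} ({status})")
```

Only the soft limit is checked here, since that is the value the kernel enforces for the process.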

Comment by Kelsey Schubert [ 13/Feb/18 ]

Hi germao216@hotmail.com,

You may need to increase your ulimits to match your use case. Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or on Stack Overflow with the mongodb tag. A question like this, which involves more discussion, would be best posted on the mongodb-user group.

Kind regards,
Kelsey

Comment by German Gutierrez [ 24/Jan/18 ]

Hi everybody

We had a crash caused by a high number of open files:

[conn117728] Fatal Assertion 28839 at src/mongo/platform/random.cpp 158

My ulimit for open files is set to 64000.

Does anybody know why MongoDB crashes once it reaches this number of open files?

Comment by Kelsey Schubert [ 13/Feb/17 ]

Hi edikmkoyan,

Thanks for the additional information. To resolve this issue, my recommendation would be to increase the open file limit. Please note that SERVER-25659 describes an improvement that would likely help with the behavior you are observing. Feel free to vote for SERVER-25659 and watch it for updates.

Kind regards,
Thomas

Comment by Edik Mkoyan [ 13/Feb/17 ]

Hi Thomas Schubert, I have left the open file limit as is, because I am sure I will never have more than 50 connections on a single mongos instance.
The other production notes have been followed: I have disabled NUMA, VMware memory ballooning, transparent huge pages, etc.
In fact, for mongos we just use an exec command that runs it under a specific user with only the cluster keyfile authentication flags.
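
Client connections are not the only consumers of file descriptors: the backtrace in the description shows the crash on a NetworkInterfaceASIO thread that was authenticating a connection opened by mongos itself. A rough way to see what a process actually holds open is to group its descriptors by kind. The sketch below assumes a Linux /proc filesystem and sufficient permissions to read the target process's fd directory; fd_usage is a hypothetical helper name.

```python
import os
from collections import Counter


def fd_usage(pid="self"):
    """Count a process's open file descriptors by kind (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    kinds = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed while we were iterating.
            continue
        if target.startswith("socket:"):
            kinds["socket"] += 1
        elif target.startswith("pipe:"):
            kinds["pipe"] += 1
        elif target.startswith("anon_inode:"):
            kinds["anon_inode"] += 1
        else:
            kinds["file"] += 1
    return kinds


if __name__ == "__main__":
    usage = fd_usage()  # pass the mongos PID instead of the default "self"
    for kind, count in sorted(usage.items()):
        print(f"{kind}: {count}")
    print(f"total: {sum(usage.values())}")
```

Comparing the total against the open file limit gives an idea of how much headroom the process has.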

Comment by Kelsey Schubert [ 13/Feb/17 ]

Hi edikmkoyan,

Would you please provide your upstart config so we can ensure that your limits are appropriately set?

Thank you,
Thomas

Comment by Edik Mkoyan [ 09/Feb/17 ]

Apologies for the long delay; we were using upstart.

Comment by Kelsey Schubert [ 06/Sep/16 ]

Hi edikmkoyan,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Thomas

Comment by Daniel Pasette (Inactive) [ 20/Jul/16 ]

Looks like file limits are incorrect as you're getting this error:

2016-07-19T17:40:02.521+0400 E -        [NetworkInterfaceASIO-TaskExecutorPool-2-0] cannot open /dev/urandom Too many open files

Are you using systemd? If so, this could be the issue: SERVER-24885
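
Limits set in an interactive shell do not necessarily apply to a service started by an init system, so it can help to read the limit from the running process itself. The following is a minimal sketch assuming a Linux /proc filesystem; max_open_files is a hypothetical helper name, and the PID of the mongos process has to be supplied by the caller.

```python
def max_open_files(pid="self"):
    """Return the (soft, hard) 'Max open files' limits of a process as strings.

    Reads /proc/<pid>/limits, which is Linux-specific. Values are returned
    as strings because a limit may be reported as 'unlimited'.
    """
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Example fields: ['Max', 'open', 'files', '1024', '4096', 'files']
                fields = line.split()
                return fields[3], fields[4]
    raise LookupError("'Max open files' not found in limits file")


if __name__ == "__main__":
    soft, hard = max_open_files()  # pass the mongos PID instead of "self"
    print(f"Max open files: soft={soft} hard={hard}")
```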

Generated at Thu Feb 08 04:08:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.