Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: None
Component/s: Stability
We hit a segmentation fault on version 3.6.6 today on our primary. Some additional context follows, which may or may not be relevant to the error.
Recently we have been running up against the connection limit because of an OS limit we haven't lifted yet (~32000), which was causing new connections to fail. While this was happening, we made a change to reduce the number of connections to the primary (~18000). That change was working, and then the primary seg faulted.
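The report does not say which OS limit sits at ~32000. As a rough aid, the sketch below prints several Linux limits that commonly cap the number of connections or threads a process can have. The choice of limits is an assumption, not something stated in this ticket, and the rlimit values are those of the Python process running the script, not of mongod itself.

```python
import resource
from pathlib import Path

# Kernel-wide settings that can cap thread creation (assumed candidates).
PROC_FILES = {
    "kernel.threads-max": "/proc/sys/kernel/threads-max",
    "kernel.pid_max": "/proc/sys/kernel/pid_max",
    "vm.max_map_count": "/proc/sys/vm/max_map_count",
}


def read_int(path):
    """Return the integer stored in a /proc file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def collect_limits():
    """Gather per-process rlimits and kernel-wide limits into one dict."""
    limits = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        const = getattr(resource, name, None)  # RLIMIT_NPROC is not on every platform
        if const is not None:
            limits[name] = resource.getrlimit(const)  # (soft, hard)
    for name, path in PROC_FILES.items():
        limits[name] = read_int(path)
    return limits


if __name__ == "__main__":
    for name, value in collect_limits().items():
        print(f"{name}: {value}")
```

To see the limits that apply to the server, run it as the same user and under the same service manager settings as mongod, or read `/proc/<pid>/limits` for the running process.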
We have three replicas in this cluster, and one replica was already down for another issue (certificate expiration). I assume this prevented re-election, causing our cluster to be unavailable until we manually cycled this replica by restarting the process.
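The re-election assumption can be checked with simple arithmetic: a replica set needs a strict majority of its voting members to elect a primary. The sketch below assumes all three members vote and that the hung primary was unreachable to the remaining member. Neither point is confirmed in this ticket.

```python
def majority(voting_members):
    """Smallest number of votes that is a strict majority."""
    return voting_members // 2 + 1


def can_elect(voting_members, reachable_members):
    """True if the reachable members are enough to elect a primary."""
    return reachable_members >= majority(voting_members)


# Three members: one down (expired certificate), one hung (the primary),
# leaving a single healthy member.
print(majority(3))      # 2
print(can_elect(3, 1))  # False
print(can_elect(3, 2))  # True
```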
This ticket is about the segfault itself, but I am also curious why the process did not exit after the crash. It was left hanging with a single core pegged (not sure what it was doing); had it exited, our system would have restarted it automatically.
2018-10-17T18:55:43.278+0000 F - [listener] Got signal: 11 (Segmentation fault).
0x56036294d8b1 0x56036294cac9 0x56036294d136 0x7f9764421390 0x7f9764417e8f 0x5603628016db 0x56036223c9ff 0x56036132709f 0x56036132789a 0x5603613256f1 0x5603624853b2 0x5603624915c9 0x560362491811 0x56036249ba5e 0x56036248377e 0x560362a5ce80 0x7f97644176ba 0x7f976414d41d
----- BEGIN BACKTRACE -----
,{"b":"5603606FA000","o":"2252AC9"},{"b":"5603606FA000","o":"2253136"},{"b":"7F9764410000","o":"11390"},{"b":"7F9764410000","o":"7E8F","s":"pthread_create"},{"b":"5603606FA000","o":"21076DB","s":"_ZN5mongo25launchServiceWorkerThreadESt8functionIFvvEE"},{"b":"5603606FA000","o":"1B429FF","s":"_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE"},{"b":"5603606FA000","o":"C2D09F","s":"_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE"},{"b":"5603606FA000","o":"C2D89A","s":"_ZN5mongo19ServiceStateMachine5startENS0_9OwnershipE"},{"b":"5603606FA000","o":"C2B6F1","s":"_ZN5mongo21ServiceEntryPointImpl12startSessionESt10shared_ptrINS_9transport7SessionEE"},{"b":"5603606FA000","o":"1D8B3B2"},{"b":"5603606FA000","o":"1D975C9","s":"_ZN4asio6detail9scheduler10do_run_oneERNS0_27conditionally_enabled_mutex11scoped_lockERNS0_21scheduler_thread_infoERKSt10error_code"},{"b":"5603606FA000","o":"1D97811","s":"_ZN4asio6detail9scheduler3runERSt10error_code"},{"b":"5603606FA000","o":"1DA1A5E","s":"_ZN4asio10io_context3runEv"},{"b":"5603606FA000","o":"1D8977E"},{"b":"5603606FA000","o":"2362E80"},{"b":"7F9764410000","o":"76BA"},{"b":"7F9764046000","o":"10741D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.6.6", "gitVersion" : "6405d65b1d6432e138b44c13085d0c2fe235d6bd", "compiledModules" : [], "uname" :
{ "sysname" : "Linux", "release" : "4.4.0-1049-aws", "version" : "#58-Ubuntu SMP Fri Jan 12 23:17:09 UTC 2018", "machine" : "x86_64" }, "somap" : [ { "b" : "5603606FA000", "elfType" : 3, "buildId" : "F63278FD698B5843222FE7A6C8FF17D6AEFBBE38" }, { "b" : "7FFF7935A000", "elfType" : 3, "buildId" : "3A8AFEDA6CA80FBF2589D7E5803A58BA8F13FE62" }, { "b" : "7F9765605000", "path" : "/lib/x86_64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "6EF73266978476EF9F2FD2CF31E57F4597CB74F8" }, { "b" : "7F97651C1000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "250E875F74377DFC74DE48BF80CCB237BB4EFF1D" }, { "b" : "7F9764F58000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "513282AC7EB386E2C0133FD9E1B6B8A0F38B047D" }, { "b" : "7F9764D54000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "8CC8D0D119B142D839800BFF71FB71E73AEA7BD4" }, { "b" : "7F9764B4C000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "89C34D7A182387D76D5CDA1F7718F5D58824DFB3" }, { "b" : "7F9764843000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "DFB85DE42DAFFD09640C8FE377D572DE3E168920" }, { "b" : "7F976462D000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "68220AE2C65D65C1B6AAA12FA6765A6EC2F5F434" }, { "b" : "7F9764410000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "CE17E023542265FC11D9BC8F534BB4F070493D30" }, { "b" : "7F9764046000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "B5381A457906D279073822A5CEB24C4BFEF94DDB" }, { "b" : "7F9765820000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5D7B6259552275A3C17BD4C3FD05F5A6BF40CAA5" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x56036294d8b1]
mongod(+0x2252AC9) [0x56036294cac9]
mongod(+0x2253136) [0x56036294d136]
libpthread.so.0(+0x11390) [0x7f9764421390]
libpthread.so.0(pthread_create+0x4FF) [0x7f9764417e8f]
mongod(_ZN5mongo25launchServiceWorkerThreadESt8functionIFvvEE+0xDB) [0x5603628016db]
mongod(_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE+0x2FF) [0x56036223c9ff]
mongod(_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE+0x15F) [0x56036132709f]
mongod(_ZN5mongo19ServiceStateMachine5startENS0_9OwnershipE+0x13A) [0x56036132789a]
mongod(_ZN5mongo21ServiceEntryPointImpl12startSessionESt10shared_ptrINS_9transport7SessionEE+0x881) [0x5603613256f1]
mongod(+0x1D8B3B2) [0x5603624853b2]
mongod(_ZN4asio6detail9scheduler10do_run_oneERNS0_27conditionally_enabled_mutex11scoped_lockERNS0_21scheduler_thread_infoERKSt10error_code+0x389) [0x5603624915c9]
mongod(_ZN4asio6detail9scheduler3runERSt10error_code+0xD1) [0x560362491811]
mongod(_ZN4asio10io_context3runEv+0x3E) [0x56036249ba5e]
mongod(+0x1D8977E) [0x56036248377e]
mongod(+0x2362E80) [0x560362a5ce80]
libpthread.so.0(+0x76BA) [0x7f97644176ba]
libc.so.6(clone+0x6D) [0x7f976414d41d]
----- END BACKTRACE -----
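The JSON frames and the text frames above describe the same addresses: each JSON frame gives a module load base `b` and an offset `o`, both in hex, and their sum is the absolute address shown in brackets in the text form. A minimal sketch of that arithmetic, using three frames copied from the backtrace (the helper name `absolute_address` is made up for illustration):

```python
import json

# Three frames copied from the JSON backtrace in this ticket.
FRAMES_JSON = """[
  {"b": "5603606FA000", "o": "2252AC9"},
  {"b": "7F9764410000", "o": "7E8F", "s": "pthread_create"},
  {"b": "7F9764046000", "o": "10741D", "s": "clone"}
]"""


def absolute_address(frame):
    """Load base plus offset gives the runtime address of the frame."""
    return int(frame["b"], 16) + int(frame["o"], 16)


frames = json.loads(FRAMES_JSON)
for frame in frames:
    symbol = frame.get("s", "<no symbol>")
    print(f"{hex(absolute_address(frame))}  {symbol}")
# 0x56036294cac9  <no symbol>
# 0x7f9764417e8f  pthread_create
# 0x7f976414d41d  clone
```

These match the bracketed addresses for `mongod(+0x2252AC9)`, `pthread_create+0x4FF`, and `clone+0x6D` in the text backtrace. Note that `o` is the offset from the module base, not from the start of the symbol.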