Hi,
I'd like to report a segfault in MongoS on MongoDB 5.0.28.
All of our MongoS instances crashed within seconds of each other, with the following error logs:
{"t":
,"s":"F", "c":"CONTROL", "id":6384300, "ctx":"conn619096","msg":"Writing fatal message","attr":{"message":"Invalid access at address: 0x7\n"}}
{"t":
,"s":"F", "c":"CONTROL", "id":6384300, "ctx":"conn619096","msg":"Writing fatal message","attr":{"message":"Got signal: 11 (Segmentation fault).\n"}}
Interestingly, it is MongoS that crashes, not MongoD. That makes things even worse for us: every MongoS instance crashes, which renders the DB unavailable.
Here is the complete stacktrace:
{ "t": { "$date": "2024-08-24T13:48:56.732+00:00" }, "s": "I", "c": "CONTROL", "id": 31380, "ctx": "conn619096", "msg": "BACKTRACE", "attr": { "bt": { "backtrace": [ { "a": "55C89BE2A9F3", "b": "55C899262000", "o": "2BC89F3", "s": "_ZN5mongo18stack_trace_detail12_GLOBAL__N_117getStackTraceImplERKNS1_7OptionsE.constprop.163", "s+": "213" }, { "a": "55C89BE2D447", "b": "55C899262000", "o": "2BCB447", "s": "_ZN5mongo15printStackTraceEv", "s+": "37" }, { "a": "55C89BE2573C", "b": "55C899262000", "o": "2BC373C", "s": "abruptQuitWithAddrSignal", "s+": "EC" }, { "a": "7F37B988A420", "b": "7F37B9876000", "o": "14420", "s": "funlockfile", "s+": "60" }, { "a": "55C89BC7ED7A", "b": "55C899262000", "o": "2A1CD7A", "s": "_ZN5mongo11ClockSource21waitForConditionUntilERNS_4stdx18condition_variableENS_20BasicLockableAdapterENS_6Date_tEPNS_8WaitableE", "s+": "B1A" }, { "a": "55C89BC72915", "b": "55C899262000", "o": "2A10915", "s": "_ZN5mongo16OperationContext40waitForConditionOrInterruptNoAssertUntilERNS_4stdx18condition_variableENS_20BasicLockableAdapterENS_6Date_tE", "s+": "265" }, { "a": "55C89BA72DE6", "b": "55C899262000", "o": "2810DE6", "s": "ZZN5mongo13Interruptible32waitForConditionOrInterruptUntilISt11unique_lockISt5mutexEZNS_12NotificationIbE7waitForEPNS_16OperationContextENS_8DurationISt5ratioILl1ELl1000EEEEEUlvE_EEbRNS_4stdx18condition_variableERT_NS_6Date_tET0_ENKUlSJ_NS0_9WakeSpeedEE1_clESJ_SL", "s+": "5A6" }, { "a": "55C89BA6F1F1", "b": "55C899262000", "o": "280D1F1", "s": "_ZN5mongo21KeysCollectionManager14PeriodicRunner10refreshNowEPNS_16OperationContextE", "s+": "371" }, { "a": "55C89BA7414F", "b": "55C899262000", "o": "281214F", "s": "_ZN5mongo20LogicalTimeValidator15signLogicalTimeEPNS_16OperationContextERKNS_11LogicalTimeE", "s+": "EF" }, { "a": "55C89BA78315", "b": "55C899262000", "o": "2816315", "s": "_ZNK5mongo11VectorClock21SignedComponentFormat3outEPNS_14ServiceContextEPNS_16OperationContextEbPNS_14BSONObjBuilderENS_11LogicalTimeENS0_9ComponentE", "s+": 
"95" }, { "a": "55C89BA76677", "b": "55C899262000", "o": "2814677", "s": "_ZNK5mongo11VectorClock19_gossipOutComponentEPNS_16OperationContextEPNS_14BSONObjBuilderERKNS0_14ComponentArrayINS_11LogicalTimeEEENS0_9ComponentE", "s+": "67" }, { "a": "55C89BA771E2", "b": "55C899262000", "o": "28151E2", "s": "_ZNK5mongo11VectorClock9gossipOutEPNS_16OperationContextEPNS_14BSONObjBuilderEj", "s+": "1B2" }, { "a": "55C89B43E572", "b": "55C899262000", "o": "21DC572", "s": "_ZN5mongo3rpc23VectorClockMetadataHook20writeRequestMetadataEPNS_16OperationContextEPNS_14BSONObjBuilderE", "s+": "32" }, { "a": "55C89B9B535C", "b": "55C899262000", "o": "275335C", "s": "_ZN5mongo3rpc22EgressMetadataHookList20writeRequestMetadataEPNS_16OperationContextEPNS_14BSONObjBuilderE", "s+": "5C" }, { "a": "55C89B5F22C8", "b": "55C899262000", "o": "23902C8", "s": "_ZN5mongo8executor18NetworkInterfaceTL12startCommandERKNS0_12TaskExecutor14CallbackHandleERNS0_24RemoteCommandRequestImplISt6vectorINS_11HostAndPortESaIS8_EEEEONS_15unique_functionIFvRKNS0_26RemoteCommandOnAnyResponseEEEERKSt10shared_ptrINS_5BatonEE", "s+": "1B8" }, { "a": "55C89B5CAB5D", "b": "55C899262000", "o": "2368B5D", "s": "_ZN5mongo8executor22ThreadPoolTaskExecutor26scheduleRemoteCommandOnAnyERKNS0_24RemoteCommandRequestImplISt6vectorINS_11HostAndPortESaIS4_EEEERKSt8functionIFvRKNS0_12TaskExecutor30RemoteCommandOnAnyCallbackArgsEEERKSt10shared_ptrINS_5BatonEE", "s+": "45D" }, { "a": "55C89A2144B1", "b": "55C899262000", "o": "FB24B1", "s": "_ZN5mongo8executor20ShardingTaskExecutor26scheduleRemoteCommandOnAnyERKNS0_24RemoteCommandRequestImplISt6vectorINS_11HostAndPortESaIS4_EEEERKSt8functionIFvRKNS0_12TaskExecutor30RemoteCommandOnAnyCallbackArgsEEERKSt10shared_ptrINS_5BatonEE", "s+": "4E1" }, { "a": "55C89A5D7F47", "b": "55C899262000", "o": "1375F47", "s": 
"ZN5mongo8executor18ScopedTaskExecutor4Impl13_wrapCallbackIZNS2_26scheduleRemoteCommandOnAnyERKNS0_24RemoteCommandRequestImplISt6vectorINS_11HostAndPortESaIS6_EEEERKSt8functionIFvRKNS0_12TaskExecutor30RemoteCommandOnAnyCallbackArgsEEERKSt10shared_ptrINS_5BatonEEEUlOT_E_SK_EENS_10StatusWithINSD_14CallbackHandleEEESR_OT0", "s+": "5C7" }, { "a": "55C89A5D82A0", "b": "55C899262000", "o": "13762A0", "s": "_ZN5mongo8executor18ScopedTaskExecutor4Impl26scheduleRemoteCommandOnAnyERKNS0_24RemoteCommandRequestImplISt6vectorINS_11HostAndPortESaIS5_EEEERKSt8functionIFvRKNS0_12TaskExecutor30RemoteCommandOnAnyCallbackArgsEEERKSt10shared_ptrINS_5BatonEE", "s+": "30" }, { "a": "55C89A5B60E3", "b": "55C899262000", "o": "13540E3", "s": "_ZN5mongo19AsyncRequestsSender10RemoteData21scheduleRemoteCommandEOSt6vectorINS_11HostAndPortESaIS3_EE", "s+": "1D3" }, { "a": "55C89A5BFC55", "b": "55C899262000", "o": "135DC55", "s": "ZZNO5mongo14future_details10FutureImplINS0_8FakeVoidEE4thenIZZZNS_14ExecutorFutureISt6vectorINS_11HostAndPortESaIS7_EEE12wrapCBHelperIFNS_10SemiFutureINS_8executor12TaskExecutor30RemoteCommandOnAnyCallbackArgsEEES9_EEEDaONS_15unique_functionIT_EEENUlDpOT_E_clIJS9_EEENS_6FutureINS0_17UnwrappedTypeImplIDTclfp_spcl7forwardIDtfp_EEfp_EEEE4typeEEESO_ENUlNS_6StatusEE_clESY_EUlvE_EEDaOSJ_ENKUlOS2_E_clES12", "s+": "E5" }, { "a": "55C89A5C085E", "b": "55C899262000", "o": "135E85E", "s": "ZN5mongo7PromiseINS_8executor12TaskExecutor30RemoteCommandOnAnyCallbackArgsEE7setWithIZZZNS_14ExecutorFutureISt6vectorINS_11HostAndPortESaIS8_EEE12wrapCBHelperIFNS_10SemiFutureIS3_EESA_EEEDaONS_15unique_functionIT_EEENUlDpOT_E_clIJSA_EEENS_6FutureINS_14future_details17UnwrappedTypeImplIDTclfp_spcl7forwardIDtfp_EEfp_EEEE4typeEEESM_ENUlNS_6StatusEE_clESX_EUlvE_Li0EEEvOSH", "s+": "7E" }, { "a": "55C89A5C09D8", "b": "55C899262000", "o": "135E9D8", "s": 
"ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZZNS_14ExecutorFutureISt6vectorINS_11HostAndPortESaIS7_EEE12wrapCBHelperIFNS_10SemiFutureINS_8executor12TaskExecutor30RemoteCommandOnAnyCallbackArgsEEES9_EEEDaONS0_IT_EEENUlDpOT_E_clIJS9_EEENS_6FutureINS_14future_details17UnwrappedTypeImplIDTclfp_spcl7forwardIDtfp_EEfp_EEEE4typeEEESN_EUlS1_E_EEDaOSI_EN12SpecificImpl4callEOS1", "s+": "98" }, { "a": "55C89BC685B4", "b": "55C899262000", "o": "2A065B4", "s": "ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_12_GLOBAL__N_18SubBaton8scheduleES3_EUlS1_E_EEDaOT_EN12SpecificImpl4callEOS1", "s+": "104" }, { "a": "55C89B61CCDD", "b": "55C899262000", "o": "23BACDD", "s": "ZZN5mongo15unique_functionIFvSt11unique_lockISt5mutexEEE8makeImplIZNS_9transport18TransportLayerASIO9BatonASIO8scheduleENS0_IFvNS_6StatusEEEEEUlS3_E_EEDaOT_EN12SpecificImpl4callEOS3", "s+": "8D" }, { "a": "55C89B62363E", "b": "55C899262000", "o": "23C163E", "s": "_ZZN5mongo9transport18TransportLayerASIO9BatonASIO3runEPNS_11ClockSourceEENKUlvE_clEv", "s+": "11E" }, { "a": "55C89B625522", "b": "55C899262000", "o": "23C3522", "s": "_ZN5mongo9transport18TransportLayerASIO9BatonASIO3runEPNS_11ClockSourceE", "s+": "732" }, { "a": "55C89B62ABBD", "b": "55C899262000", "o": "23C8BBD", "s": "_ZN5mongo9transport18TransportLayerASIO9BatonASIO9run_untilEPNS_11ClockSourceENS_6Date_tE", "s+": "5D" }, { "a": "55C89BC7E3A1", "b": "55C899262000", "o": "2A1C3A1", "s": "_ZN5mongo11ClockSource21waitForConditionUntilERNS_4stdx18condition_variableENS_20BasicLockableAdapterENS_6Date_tEPNS_8WaitableE", "s+": "141" }, { "a": "55C89BC72915", "b": "55C899262000", "o": "2A10915", "s": "_ZN5mongo16OperationContext40waitForConditionOrInterruptNoAssertUntilERNS_4stdx18condition_variableENS_20BasicLockableAdapterENS_6Date_tE", "s+": "265" }, { "a": "55C89A5BE09E", "b": "55C899262000", "o": "135C09E", "s": 
"ZZN5mongo13Interruptible32waitForConditionOrInterruptUntilISt11unique_lockISt5mutexEZNS_30producer_consumer_queue_detail21ProducerConsumerQueueINS_19AsyncRequestsSender8ResponseELNS5_12ProducerKindE0ELNS5_12ConsumerKindE0ENS5_19DefaultCostFunctionEE16_waitForNonEmptyERS4_PS0_EUlvE_EEbRNS_4stdx18condition_variableERT_NS_6Date_tET0_ENKUlSL_NS0_9WakeSpeedEE1_clESL_SN", "s+": "26E" }, { "a": "55C89A5BE29F", "b": "55C899262000", "o": "135C29F", "s": "ZZN5mongo13Interruptible32waitForConditionOrInterruptUntilISt11unique_lockISt5mutexEZNS_30producer_consumer_queue_detail21ProducerConsumerQueueINS_19AsyncRequestsSender8ResponseELNS5_12ProducerKindE0ELNS5_12ConsumerKindE0ENS5_19DefaultCostFunctionEE16_waitForNonEmptyERS4_PS0_EUlvE_EEbRNS_4stdx18condition_variableERT_NS_6Date_tET0_ENKUlSL_NS0_9WakeSpeedEE2_clESL_SN", "s+": "AF" }, { "a": "55C89A5C1768", "b": "55C899262000", "o": "135F768", "s": "ZN5mongo30producer_consumer_queue_detail21ProducerConsumerQueueINS_19AsyncRequestsSender8ResponseELNS0_12ProducerKindE0ELNS0_12ConsumerKindE0ENS0_19DefaultCostFunctionEE10_popRunnerIZNS7_3popEPNS_13InterruptibleEEUlRSt11unique_lockISt5mutexEE_EEDaOT", "s+": "1C8" }, { "a": "55C89A5B95C9", "b": "55C899262000", "o": "13575C9", "s": "_ZN5mongo19AsyncRequestsSender4nextEv", "s+": "119" }, { "a": "55C89A571D69", "b": "55C899262000", "o": "130FD69", "s": "_ZN5mongo39MultiStatementTransactionRequestsSender4nextEv", "s+": "29" }, { "a": "55C89A54CC28", "b": "55C899262000", "o": "12EAC28", "s": "_ZN5mongo16establishCursorsEPNS_16OperationContextESt10shared_ptrINS_8executor12TaskExecutorEERKNS_15NamespaceStringENS_21ReadPreferenceSettingERKSt6vectorISt4pairINS_7ShardIdENS_7BSONObjEESaISE_EEbNS_5Shard11RetryPolicyE", "s+": "898" }, { "a": "55C89A52EE25", "b": "55C899262000", "o": "12CCE25", "s": "_ZN5mongo11ClusterFind8runQueryEPNS_16OperationContextERKNS_14CanonicalQueryERKNS_21ReadPreferenceSettingEPSt6vectorINS_7BSONObjESaISA_EEPb", "s+": "FA5" }, { "a": "55C89A282D07", "b": "55C899262000", 
"o": "1020D07", "s": "_ZN5mongo12_GLOBAL__N_114ClusterFindCmd10Invocation3runEPNS_16OperationContextEPNS_3rpc21ReplyBuilderInterfaceE", "s+": "217" }, { "a": "55C89B4DF01F", "b": "55C899262000", "o": "227D01F", "s": "_ZN5mongo14CommandHelpers20runCommandInvocationEPNS_16OperationContextERKNS_12OpMsgRequestEPNS_17CommandInvocationEPNS_3rpc21ReplyBuilderInterfaceE", "s+": "7F" }, { "a": "55C89B4E4F1E", "b": "55C899262000", "o": "2282F1E", "s": "_ZN5mongo14CommandHelpers20runCommandInvocationESt10shared_ptrINS_23RequestExecutionContextEES1_INS_17CommandInvocationEENS_9transport15ServiceExecutor14ThreadingModelE", "s+": "1BE" }, { "a": "55C89A306067", "b": "55C899262000", "o": "10A4067", "s": "_ZN5mongo12_GLOBAL__N_120runCommandInvocationESt10shared_ptrINS_23RequestExecutionContextEES1_INS_17CommandInvocationEE", "s+": "97" }, { "a": "55C89A30FA4D", "b": "55C899262000", "o": "10ADA4D", "s": "_ZN5mongo12_GLOBAL__N_117ExecCommandClient3runEv", "s+": "4AD" }, { "a": "55C89A31049F", "b": "55C899262000", "o": "10AE49F", "s": "_ZN5mongo12_GLOBAL__N_118ParseAndRunCommand11RunAndRetry3runEv", "s+": "3AF" }, { "a": "55C89A311C61", "b": "55C899262000", "o": "10AFC61", "s": "_ZN5mongo12_GLOBAL__N_118ParseAndRunCommand3runEv", "s+": "B1" }, { "a": "55C89A312901", "b": "55C899262000", "o": "10B0901", "s": "_ZN5mongo13ClientCommand8_executeEv", "s+": "321" }, { "a": "55C89A313363", "b": "55C899262000", "o": "10B1363", "s": "_ZN5mongo13ClientCommand3runEv", "s+": "43" }, { "a": "55C89A313DEC", "b": "55C899262000", "o": "10B1DEC", "s": "ZN5mongo19makeReadyFutureWithIZNOS_11future_util10AsyncStateINS_13ClientCommandEE13thenWithStateIZNS_8Strategy13clientCommandESt10shared_ptrINS_23RequestExecutionContextEEEUlPT_E_EEDaOSA_EUlvE_Li0EEENS_6FutureINS_14future_details17UnwrappedTypeImplINSt13invoke_resultISD_JEE4typeEE4typeEEESD", "s+": "3C" }, { "a": "55C89A314BC4", "b": "55C899262000", "o": "10B2BC4", "s": "_ZN5mongo8Strategy13clientCommandESt10shared_ptrINS_23RequestExecutionContextEE", 
"s+": "184" }, { "a": "55C89A1B3373", "b": "55C899262000", "o": "F51373", "s": "_ZN5mongo13HandleRequest13handleRequestEv", "s+": "B3" }, { "a": "55C89A1B3820", "b": "55C899262000", "o": "F51820", "s": "ZZN5mongo15unique_functionIFvPNS_14future_details15SharedStateBaseEEE8makeImplIZNS1_10FutureImplINS1_8FakeVoidEE16makeContinuationINS_10DbResponseEZZNOS9_4thenIZNS_13HandleRequest3runEvEUlvE0_EEDaOT_ENKUlvE1_clEvEUlPNS1_15SharedStateImplIS8_EEPNSI_ISB_EEE_EENS7_ISF_EEOT0_EUlS3_E_EEDaSG_EN12SpecificImpl4callEOS3", "s+": "A0" }, { "a": "55C89A1B6B47", "b": "55C899262000", "o": "F54B47", "s": "_ZN5mongo14future_details15SharedStateBase20transitionToFinishedEv", "s+": "147" }, { "a": "55C89A1B6B47", "b": "55C899262000", "o": "F54B47", "s": "_ZN5mongo14future_details15SharedStateBase20transitionToFinishedEv", "s+": "147" }, { "a": "55C89A1B4A11", "b": "55C899262000", "o": "F52A11", "s": "_ZN5mongo13HandleRequest3runEv", "s+": "791" }, { "a": "55C89A1B5B44", "b": "55C899262000", "o": "F53B44", "s": "_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE", "s+": "214" }, { "a": "55C89A1F8477", "b": "55C899262000", "o": "F96477", "s": "_ZN5mongo9transport19ServiceStateMachine4Impl14processMessageEv", "s+": "127" }, { "a": "55C89A1F8D67", "b": "55C899262000", "o": "F96D67", "s": "ZZNO5mongo14future_details10FutureImplINS0_8FakeVoidEE4thenIZNS_9transport19ServiceStateMachine4Impl12startNewLoopERKNS_6StatusEEUlvE0_EEDaOT_ENKUlOS2_E_clESE.isra.721", "s+": "27" }, { "a": "55C89A1FBD3A", "b": "55C899262000", "o": "F99D3A", "s": "_ZN5mongo9transport19ServiceStateMachine4Impl12startNewLoopERKNS_6StatusE", "s+": "2EA" }, { "a": "55C89A1FC25F", "b": "55C899262000", "o": "F9A25F", "s": "ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_9transport19ServiceStateMachine4Impl15scheduleNewLoopES1_EUlS1_E_EEDaOT_EN12SpecificImpl4callEOS1", "s+": "7F" }, { "a": "55C89B608881", "b": "55C899262000", "o": "23A6881", "s": 
"ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_9transport26ServiceExecutorSynchronous18runOnDataAvailableERKSt10shared_ptrINS5_7SessionEES3_EUlS1_E_EEDaOT_EN12SpecificImpl4callEOS1", "s+": "41" }, { "a": "55C89B603073", "b": "55C899262000", "o": "23A1073", "s": "_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_9transport15ServiceExecutor8scheduleENS0_IFvNS_6StatusEEEEEUlvE_EEDaOT_EN12SpecificImpl4callEv", "s+": "33" }, { "a": "55C89B608C6B", "b": "55C899262000", "o": "23A6C6B", "s": "_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_9transport26ServiceExecutorSynchronous12scheduleTaskES2_NS4_15ServiceExecutor13ScheduleFlagsEEUlvE0_EEDaOT_EN12SpecificImpl4callEv", "s+": "BB" }, { "a": "55C89B60BECC", "b": "55C899262000", "o": "23A9ECC", "s": "_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_25launchServiceWorkerThreadES2_EUlvE2_EEDaOT_EN12SpecificImpl4callEv", "s+": "5C" }, { "a": "55C89B60BF3C", "b": "55C899262000", "o": "23A9F3C", "s": "_ZN5mongo12_GLOBAL__N_17runFuncEPv", "s+": "1C" }, { "a": "7F37B987E609", "b": "7F37B9876000", "o": "8609", "s": "start_thread", "s+": "D9" }, { "a": "7F37B97A3353", "b": "7F37B9684000", "o": "11F353", "s": "clone", "s+": "43" } ], "processInfo": { "mongodbVersion": "5.0.28", "gitVersion": "a8f8b8e1e271f236e761d0138e2418d0a114c941", "compiledModules": , "uname": { "sysname": "Linux", "release": "5.15.0-1062-gcp", "version": "#70~20.04.1-Ubuntu SMP Fri May 24 20:12:18 UTC 2024", "machine": "x86_64" }, "somap": [ { "b": "55C899262000", "elfType": 3, "buildId": "321DACE850546A7881C1A5F5DCD3DB66C5ADF271" }, { "b": "7F37B9876000", "path": "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType": 3, "buildId": "9A65BB469E45A1C6FBCFFAE5B82A2FD7A69EB479" } ] } } } }
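In case it helps with triage, here is a minimal sketch of how the frames can be pulled out of a BACKTRACE log entry like the one above. It assumes the entry is valid JSON with the frames under attr.bt.backtrace; the helper name list_frames and the trimmed two-frame SAMPLE are only for illustration (the two frames are copied from the trace above).

```python
import json

# Trimmed sample with two frames copied from the trace above.
# A full BACKTRACE entry is assumed to have the same shape.
SAMPLE = json.dumps({
    "msg": "BACKTRACE",
    "attr": {"bt": {"backtrace": [
        {"a": "55C89BE2D447", "b": "55C899262000", "o": "2BCB447",
         "s": "_ZN5mongo15printStackTraceEv", "s+": "37"},
        {"a": "55C89A5B95C9", "b": "55C899262000", "o": "13575C9",
         "s": "_ZN5mongo19AsyncRequestsSender4nextEv", "s+": "119"},
    ]}},
})


def list_frames(log_line):
    """Return (offset, mangled_symbol) pairs for each frame in the entry.

    "o" is the hex offset of the frame inside its module; "s" is the
    mangled symbol name, which may be missing for some frames.
    """
    entry = json.loads(log_line)
    frames = entry["attr"]["bt"]["backtrace"]
    return [(int(f["o"], 16), f.get("s", "??")) for f in frames]


for offset, symbol in list_frames(SAMPLE):
    print(f"{offset:#x} {symbol}")
```

The mangled names printed this way can then be passed through a demangler such as c++filt to get readable C++ signatures.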
Thanks!