[SERVER-65252] Mongod process getting crashed with Got signal: 11 (Segmentation fault) Created: 05/Apr/22  Updated: 06/Dec/22

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: 4.4.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: VJ Shree Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File image-2022-04-06-16-15-54-515.png     File logs.tar.gz     Text File logs_beforeCrash.txt    
Assigned Teams:
Query Optimization
Operating System: ALL
Sprint: QE 2022-05-02, QO 2022-05-16, QO 2022-05-30, QO 2022-06-13, QO 2022-06-27, QO 2022-07-11, QO 2022-07-25, QO 2022-08-08, QO 2022-08-22, QO 2022-09-05
Participants:

 Description   

The mongod process crashed many times with a "Got signal: 11 (Segmentation fault)" error.

Version 4.2

2022-03-28T04:29:11.381+0100 I  COMMAND  [conn1079680] warning: log line attempted (31kB) over max size (10kB), printing beginning and end ... command koredbm001.$cmd command: update \{ update: "botsessions", updates: [ { q: { _id: { $in: [ ObjectId('62412a38ea64da00a1e13a76'), ObjectId('624124f37b7f9500d42c557a'), ObjectId('6241259c8daef100d39c3978'), ObjectId('624127a2a10bd100a2c92bcd'), ObjectId('6241277d391b7c00c11fc66b'), ObjectId('624112df75791b00c18a96ca'), ObjectId('6241239f75791b00c18a96cd'), ObjectId('624126af8daef100d39c3979'), ObjectId('624126a3f2a11800a2c909dd'), ObjectId('6241155ef02c2100c127821a'), ObjectId('624124d0a10bd100a2c92bcc'), ObjectId('624123a2a10bd100a2c92bca'), ObjectId('62411ea7e2456800a29c849d'), ObjectId('62411e66b3b13800c038c609'), ObjectId('624121d648c45900d4d68538'), ObjectId('6241150bea64da00a1e13a71'), ObjectId('62411c26f2a11800a2c909da'), ObjectId('62411aea48c45900d4d68535'), ObjectId('62411771b3b13800c038c607'), ObjectId('62410d307b7f9500d42c5574'), ObjectId('624110df391b7c00c11fc665'), ObjectId('62411b6b391b7c00c11fc668'), ObjectId('62411dafea64da00a1e13a74'), ObjectId('62410fcd48c45900d4d68530'), ObjectId('62410e870e467600d4306200'), ObjectId('6240fdd67b7f9500d42c5572'), ObjectId('6240fdc7f2a11800a2c909d3'), ObjectId('62409a9db3b13800c038c5fb'), ObjectId('624069e2f2a11800a2c909ce'), ObjectId('62405eb6b3b13800c038c5f7'), ObjectId('62405748b3b13800c038c5f6'), ObjectId('6240519048c45900d4d68521'), ObjectId('62403f51e2456800a29c848a'), ObjectId('62403dee0e467600d43061f3'), ObjectId('62402174e2456800a29c8488'), ObjectId('62401fc348c45900d4d6851e'), ObjectId('6240170f48c45900d4d6851d'), ObjectId('6240138b8daef100d39c395e'), ObjectId('624008e78daef100d39c395b'), ObjectId('623fe5c375791b00c18a96b8'), ObjectId('623f528b391b7c00c11fc64e'), ObjectId('623f5185f2a11800a2c909c1'), ObjectId('623f3b03391b7c00c11fc64c'), ObjectId('623f38477b7f9500d42c555c'), ObjectId('623f078e7b7f9500d42c5555'), ObjectId('623efb4be2456800a29c8475'), 
ObjectId('623ef4afea64da00a1e13a4d'), ObjectId('623eed9f8daef100d39c3951'), ObjectId('623ee1d37b7f9500d42c5552'), ObjectId('623ed9980e467600d43061d9'), ObjectId('623ed67f7b7f9500d42c5550'), ObjectId('623ea90f75791b00c18a96a3'), ObjectId('623e8d6075791b00c18a96a0'), ObjectId('623e722d391b7c00c11fc63c'), ObjectId('623e59b975791b00c18a9699'), ObjectId('623e52cb75791b00c18a9698'), ObjectId('623e3f928daef100d39c3942'), ObjectId('623e3c43f02c2100c12781e8'), ObjectId('623e3abee2456800a29c8467'), ObjectId('623e25c90e467600d43061c8'), ObjectId('623e0a1bb3b13800c038c5c7'), ObjectId('623e02dc48c45900d4d684f3'), ObjectId('623e0170b3b13800c038c5c6'), ObjectId('623dfc0e75791b00c18a968c'), ObjectId('623dfb838daef100d39c3937'), ObjectId('623df7fb8daef100d39c3936'), ObjectId('623df7df7b7f9500d42c553d'), ObjectId('623df0270e467600d43061bd'), ObjectId('623dee52a10bd100a2c92b86'), ObjectId('623dee49f02c2100c12781d9'), ObjectId('623dee190e467600d43061bb'), ObjectId('623dec41f2a11800a2c90998'), ObjectId('623de86bea64da00a1e13a30'), ObjectId('623de3d9f02c2100c12781d1'), ObjectId('623de0f6f2a11800a2c90993'), ObjectId('623de0d0b3b13800c038c5b9'), ObjectId('623ddf97f2a11800a2c90992'), ObjectId('623ddf33f2a11800a2c90990'), ObjectId('623ddcba391b7c00c11fc625'), ObjectId('623ddd300e467600d43061b6'), ObjectId('623ddc62f2a11800a2c9098f'), ObjectId('623dda70ea64da00a1e13a26'), ObjectId('623dd81d391b7c00c11fc621'), ObjectId('623dd20975791b00c18a9674'), ObjectId('623dd0d6e2456800a29c8446'), ObjectId('623dcfcfb3b13800c038c5ab'), ObjectId('623dcf470e467600d43061ae'), ObjectId(' .......... 
526'), ObjectId('623813faea64da00a1e13524'), ObjectId('62381369e2456800a29c7f47'), ObjectId('62381209b3b13800c038c093'), ObjectId('623811d58daef100d39c341f'), ObjectId('623811c4391b7c00c11fc0f9'), ObjectId('6238116df02c2100c1277c98'), ObjectId('62381092391b7c00c11fc0f7'), ObjectId('62380fa8ea64da00a1e1351c'), ObjectId('62380e68f02c2100c1277c94'), ObjectId('62380d0375791b00c18a9160'), ObjectId('62380ce6b3b13800c038c08c'), ObjectId('62380cdb75791b00c18a915f'), ObjectId('62380bd975791b00c18a915d'), ObjectId('62380b21f2a11800a2c90472'), ObjectId('62380b8ae2456800a29c7f3d'), ObjectId('6238087e7b7f9500d42c5020'), ObjectId('623807fab3b13800c038c089'), ObjectId('623807f40e467600d4305c96'), ObjectId('6238072e7b7f9500d42c501f'), ObjectId('62380663f02c2100c1277c8d'), ObjectId('62380596f02c2100c1277c8c'), ObjectId('623804748daef100d39c3415'), ObjectId('6238030e8daef100d39c3414'), ObjectId('62380250b3b13800c038c084'), ObjectId('6238022e8daef100d39c3412'), ObjectId('62380023f02c2100c1277c8a'), ObjectId('6237fd6b0e467600d4305c91'), ObjectId('6237fe0a0e467600d4305c92'), ObjectId('6237fc97b3b13800c038c081'), ObjectId('6237fba0b3b13800c038c080'), ObjectId('6237fb26f02c2100c1277c89'), ObjectId('6237fa770e467600d4305c8f'), ObjectId('6237f96fa10bd100a2c92654'), ObjectId('6237f8ab48c45900d4d67fc4'), ObjectId('6237f4eaf2a11800a2c90464'), ObjectId('6237f51ae2456800a29c7f32'), ObjectId('6237f50248c45900d4d67fc3'), ObjectId('6237f36c75791b00c18a9151'), ObjectId('6237f2e3f02c2100c1277c84'), ObjectId('6237f1c08daef100d39c340c'), ObjectId('6237f10e75791b00c18a914f'), ObjectId('6237f08fb3b13800c038c07d'), ObjectId('6237ee23ea64da00a1e1350c'), ObjectId('6237ed4f8daef100d39c3408'), ObjectId('6237ec73ea64da00a1e1350a'), ObjectId('6237eb49ea64da00a1e13509'), ObjectId('6237eae4391b7c00c11fc0dc'), ObjectId('6237e93ef02c2100c1277c81'), ObjectId('6237e73e48c45900d4d67fbc'), ObjectId('6237e6cbf2a11800a2c9045f'), ObjectId('6237e4c1a10bd100a2c9264d'), ObjectId('6237e3e0391b7c00c11fc0da'), 
ObjectId('6237e28ff02c2100c1277c7c'), ObjectId('6237e2398daef100d39c3406'), ObjectId('6237da09391b7c00c11fc0d7'), ObjectId('6237ca5048c45900d4d67fb2'), ObjectId('62378690f02c2100c1277c74'), ObjectId('623776a68daef100d39c33fd'), ObjectId('62376e298daef100d39c33fc'), ObjectId('623752fd7b7f9500d42c5004'), ObjectId('62374f32ea64da00a1e134f8'), ObjectId('62374d6af2a11800a2c9044e'), ObjectId('623732cea10bd100a2c9263f'), ObjectId('62371ce6b3b13800c038c06a'), ObjectId('62370b5eb3b13800c038c066'), ObjectId('6236caab391b7c00c11fc0ca'), ObjectId('62369328f02c2100c1277c67'), ObjectId('62368b517b7f9500d42c4ff3'), ObjectId('62368acdf2a11800a2c9043e'), ObjectId('623689c9b3b13800c038c05c') ] } }, u: \{ $set: { sessionState: 0 } }, upsert: false, multi: true } ], ordered: true, writeConcern: \{ w: 1 }, lsid: \{ id: UUID("d8903daa-40dd-4f1c-8f01-21552e0433a0") }, $clusterTime: \{ clusterTime: Timestamp(1648438150, 29), signature: { hash: BinData(0, CCB9B6FDA9D746C14D3EBCE943635D100C197265), keyId: 7039263725253033985 } }, $db: "koredbm001" } numYields:20 reslen:245 locks:\{ ParallelBatchWriterMode: { acquireCount: { r: 22 } }, ReplicationStateTransition: \{ acquireCount: { w: 22 } }, Global: \{ acquireCount: { r: 1, w: 21 } }, Database: \{ acquireCount: { w: 21 } }, Collection: \{ acquireCount: { w: 21 } }, Mutex: \{ acquireCount: { r: 828 } } } flowControl:\{ acquireCount: 21, timeAcquiringMicros: 48 } storage:{} protocol:op_msg 242ms
 
2022-03-28T04:29:25.440+0100 F  -        [conn1079167] Invalid access at address: 0x55e7d6b96738
 
2022-03-28T04:29:25.509+0100 F  -        [conn1079167] Got signal: 11 (Segmentation fault).
 
0x55e7d5c12361 0x55e7d5c1198c 0x55e7d5c11b70 0x7f6d8d326630 0x7f6d8d0a03ff 0x55e7d40b2046 0x55e7d57509ac 0x55e7d5750e4a 0x55e7d5750f14 0x55e7d575109b 0x55e7d5752e1b 0x55e7d5752e82 0x55e7d4b03f71 0x55e7d4b08847 0x55e7d47bd861 0x55e7d47bec3a 0x55e7d55f70af 0x55e7d44a9caa 0x55e7d44aaf95 0x55e7d44acf14 0x55e7d44adc7a 0x55e7d449a70c 0x55e7d44a77ac 0x55e7d44a4b4f 0x55e7d44a671c 0x55e7d5327d02 0x55e7d44a1bad 0x55e7d44a33ab 0x55e7d44a3e46 0x55e7d44a4aab 0x55e7d44a671c 0x55e7d532816b 0x55e7d5982d05 0x55e7d5982d64 0x7f6d8d31eea5 0x7f6d8d047b0d

BACKTRACE

----- BEGIN BACKTRACE -----
 
{"backtrace":[\{"b":"55E7D32FE000","o":"2914361","s":"_ZN5mongo15printStackTraceERSo"},\{"b":"55E7D32FE000","o":"291398C"},\{"b":"55E7D32FE000","o":"2913B70"},\{"b":"7F6D8D317000","o":"F630"},\{"b":"7F6D8CF49000","o":"1573FF"},\{"b":"55E7D32FE000","o":"DB4046","s":"_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPcEEvT_S7_St20forward_iterator_tag"},\{"b":"55E7D32FE000","o":"24529AC","s":"_ZN5mongo18PlanCacheIndexTree13setIndexEntryERKNS_10IndexEntryE"},\{"b":"55E7D32FE000","o":"2452E4A","s":"_ZNK5mongo18PlanCacheIndexTree5cloneEv"},\{"b":"55E7D32FE000","o":"2452F14","s":"_ZNK5mongo17SolutionCacheData5cloneEv"},\{"b":"55E7D32FE000","o":"245309B","s":"_ZN5mongo14CachedSolutionC2ERKNS_12PlanCacheKeyERKNS_14PlanCacheEntryE"},\{"b":"55E7D32FE000","o":"2454E1B","s":"_ZNK5mongo9PlanCache3getERKNS_12PlanCacheKeyE"},\{"b":"55E7D32FE000","o":"2454E82","s":"_ZNK5mongo9PlanCache21getCacheEntryIfActiveERKNS_12PlanCacheKeyE"},\{"b":"55E7D32FE000","o":"1805F71"},\{"b":"55E7D32FE000","o":"180A847","s":"_ZN5mongo17getExecutorUpdateEPNS_16OperationContextEPNS_7OpDebugEPNS_10CollectionEPNS_12ParsedUpdateE"},\{"b":"55E7D32FE000","o":"14BF861"},\{"b":"55E7D32FE000","o":"14C0C3A"},\{"b":"55E7D32FE000","o":"22F90AF","s":"_ZN5mongo12BasicCommand10Invocation3runEPNS_16OperationContextEPNS_3rpc21ReplyBuilderInterfaceE"},\{"b":"55E7D32FE000","o":"11ABCAA"},\{"b":"55E7D32FE000","o":"11ACF95"},\{"b":"55E7D32FE000","o":"11AEF14"},\{"b":"55E7D32FE000","o":"11AFC7A","s":"_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageERKNS0_5HooksE"},\{"b":"55E7D32FE000","o":"119C70C","s":"_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE"},\{"b":"55E7D32FE000","o":"11A97AC","s":"_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE"},\{"b":"55E7D32FE000","o":"11A6B4F","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},\{"b":"55E7D32FE000","o":"11A871C"},\{"b":"55E7D32FE000","o":"202
9D02","s":"_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE"},\{"b":"55E7D32FE000","o":"11A3BAD","s":"_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE"},\{"b":"55E7D32FE000","o":"11A53AB","s":"_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE"},\{"b":"55E7D32FE000","o":"11A5E46","s":"_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE"},\{"b":"55E7D32FE000","o":"11A6AAB","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},\{"b":"55E7D32FE000","o":"11A871C"},\{"b":"55E7D32FE000","o":"202A16B"},\{"b":"55E7D32FE000","o":"2684D05"},\{"b":"55E7D32FE000","o":"2684D64"},\{"b":"7F6D8D317000","o":"7EA5"},\{"b":"7F6D8CF49000","o":"FEB0D","s":"clone"}],"processInfo":{ "mongodbVersion" : "4.2.11", "gitVersion" : "ea38428f0c6742c7c2c7f677e73d79e17a2aab96", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-1160.45.1.el7.x86_64", "version" : "[#1|https://kore.zendesk.com/agent/tickets/1] SMP Fri Sep 24 10:17:16 UTC 2021", "machine" : "x86_64" }, "somap" : [ \{ "b" : "55E7D32FE000", "elfType" : 3, "buildId" : "5630DD35AEAAF9DD47E830FA1CB312954AEBF72C" }, \{ "b" : "7FFF7D4AF000", "elfType" : 3, "buildId" : "E03ADB976A1EEED0CF5B1883188234BFF1AB6F85" }, \{ "b" : "7F6D8E746000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "087A2C6E9C2E9F78B6F9ACBD4A69E3EAF0FF9FF1" }, \{ "b" : "7F6D8E52C000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "B45C711D26DDD9F612D7814CE83B427927C8BC65" }, \{ "b" : "7F6D8E0C9000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "3C23941965821DC3BF18E60868C98E26E62C2BB1" }, \{ "b" : "7F6D8DE57000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "FCF9ABA6A186EA1AF5E235B747F0BA7D83ABAEE3" }, \{ "b" : "7F6D8DC53000", "path" : 
"/lib64/libdl.so.2", "elfType" : 3, "buildId" : "7F2E9CB0769D7E57BD669B485A74B537B63A57C4" }, \{ "b" : "7F6D8DA4B000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "3E44DF7055942478D052E40FDD1F5B7862B152B0" }, \{ "b" : "7F6D8D749000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "7615604EAF4A068DFAE5085444D15C0DEE93DFBD" }, \{ "b" : "7F6D8D533000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "EDF51350C7F71496149D064AA8B1441F786DF88A" }, \{ "b" : "7F6D8D317000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "E10CC8F2B932FC3DAEDA22F8DAC5EBB969524E5B" }, \{ "b" : "7F6D8CF49000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8983F1C29724D82B38816C8C05A2BCB34D3283CA" }, \{ "b" : "7F6D8E9B0000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "62C449974331341BB08DCCE3859560A22AF1E172" }, \{ "b" : "7F6D8CD16000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "F4123103FB2318594448C44E47091DD68D1C78C0" }, \{ "b" : "7F6D8CAE9000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "2B2059A2C4A5C03706931D8FF0E56C42116C259C" }, \{ "b" : "7F6D8C886000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : "AC64886BEDE635D0A6AA63836933A977898F12BC" }, \{ "b" : "7F6D8C65E000", "path" : "/lib64/libsmime3.so", "elfType" : 3, "buildId" : "DD041F0CB95D33D44432EF2F6E9B8610A4743B0E" }, \{ "b" : "7F6D8C325000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "84F63CB6B4180B1C48980BF36F1022810A11B7E3" }, \{ "b" : "7F6D8C0F5000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "645C288BDFD0B9A7AF4B62FDFDCC13F813A65040" }, \{ "b" : "7F6D8BEF1000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "D2BFF4D7F5B7F3A7F11DFB36B24EFA3833727553" }, \{ "b" : "7F6D8BCEC000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "33FE665F335F345FC0034EB701DD4D88F392C1F4" }, \{ "b" : "7F6D8BAAE000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : 
"E9E18F214AC88ABE75774EA57B72C56E7F053C90" }, \{ "b" : "7F6D8B861000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "AEA6B45BD6C1D2A780B5564D84C8769EA62F80D0" }, \{ "b" : "7F6D8B578000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "6AFDEFE4487A2D18F9B270C8FC933BF07C5E6B1C" }, \{ "b" : "7F6D8B345000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "B116323D8BD4087DBFE1837A24EA13845E8BC51D" }, \{ "b" : "7F6D8B141000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "E4C7298B74FEEADC4DDE40CDD8C4D6B85FE09ADE" }, \{ "b" : "7F6D8AF32000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "362E4B3B349ABC3FFF4B6A5D8C28417957F6815C" }, \{ "b" : "7F6D8ACDD000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "C8EBC92AA44DF92EF2CB3D7025AC6F102D646598" }, \{ "b" : "7F6D8AAC7000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "B04855870B0DE434F354DE3147230F2677200B56" }, \{ "b" : "7F6D8A8B7000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "AED31F16223CE52AE079AB1ED4C09AC4C98F86B8" }, \{ "b" : "7F6D8A6B3000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8CA73C16CFEB9A8B5660015B9223B09F87041CAD" }, \{ "b" : "7F6D8A496000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : "9AF2AD92DADE046C6260DCCF02846BF78ABC658C" }, \{ "b" : "7F6D8A26F000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "805AB866A4573EFEC4D8EA95123E8349B2B9D349" }, \{ "b" : "7F6D8A038000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "97BE6F9199FED4491B00AA91F7E6EACC4D5328F7" }, \{ "b" : "7F6D89DD6000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "F5B144F9F5D9BE451C80211B34DB2CE348E039B6" }, \{ "b" : "7F6D89BD3000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "EE139293597891D8814160CDBFEEBF8AE3651608" } ] }}
 
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x55e7d5c12361]
 
mongod(+0x291398C) [0x55e7d5c1198c]
 
mongod(+0x2913B70) [0x55e7d5c11b70]
 
libpthread.so.0(+0xF630) [0x7f6d8d326630]
 
libc.so.6(+0x1573FF) [0x7f6d8d0a03ff]
 
mongod(_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPcEEvT_S7_St20forward_iterator_tag+0xB6) [0x55e7d40b2046]
 
mongod(_ZN5mongo18PlanCacheIndexTree13setIndexEntryERKNS_10IndexEntryE+0x5C) [0x55e7d57509ac]
 
mongod(_ZNK5mongo18PlanCacheIndexTree5cloneEv+0x8A) [0x55e7d5750e4a]
 
mongod(_ZNK5mongo17SolutionCacheData5cloneEv+0x44) [0x55e7d5750f14]
 
mongod(_ZN5mongo14CachedSolutionC2ERKNS_12PlanCacheKeyERKNS_14PlanCacheEntryE+0x13B) [0x55e7d575109b]
 
mongod(_ZNK5mongo9PlanCache3getERKNS_12PlanCacheKeyE+0xDB) [0x55e7d5752e1b]
 
mongod(_ZNK5mongo9PlanCache21getCacheEntryIfActiveERKNS_12PlanCacheKeyE+0x32) [0x55e7d5752e82]
 
mongod(+0x1805F71) [0x55e7d4b03f71]
 
mongod(_ZN5mongo17getExecutorUpdateEPNS_16OperationContextEPNS_7OpDebugEPNS_10CollectionEPNS_12ParsedUpdateE+0x677) [0x55e7d4b08847]
 
mongod(+0x14BF861) [0x55e7d47bd861]
 
mongod(+0x14C0C3A) [0x55e7d47bec3a]
 
mongod(_ZN5mongo12BasicCommand10Invocation3runEPNS_16OperationContextEPNS_3rpc21ReplyBuilderInterfaceE+0xAF) [0x55e7d55f70af]
 
mongod(+0x11ABCAA) [0x55e7d44a9caa]
 
mongod(+0x11ACF95) [0x55e7d44aaf95]
 
mongod(+0x11AEF14) [0x55e7d44acf14]
 
mongod(_ZN5mongo23ServiceEntryPointCommon13handleRequestEPNS_16OperationContextERKNS_7MessageERKNS0_5HooksE+0x42A) [0x55e7d44adc7a]
 
mongod(_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x3C) [0x55e7d449a70c]
 
mongod(_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE+0xEC) [0x55e7d44a77ac]
 
mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x17F) [0x55e7d44a4b4f]
 
mongod(+0x11A871C) [0x55e7d44a671c]
 
mongod(_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE+0x182) [0x55e7d5327d02]
 
mongod(_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE+0x10D) [0x55e7d44a1bad]
 
mongod(_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE+0x6DB) [0x55e7d44a33ab]
 
mongod(_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE+0x316) [0x55e7d44a3e46]
 
mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0xDB) [0x55e7d44a4aab]
 
mongod(+0x11A871C) [0x55e7d44a671c]
 
mongod(+0x202A16B) [0x55e7d532816b]
 
mongod(+0x2684D05) [0x55e7d5982d05]
 
mongod(+0x2684D64) [0x55e7d5982d64]
 
libpthread.so.0(+0x7EA5) [0x7f6d8d31eea5]
 
libc.so.6(clone+0x6D) [0x7f6d8d047b0d]
 
-----  END BACKTRACE  -----
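For anyone cross-checking the two forms of the backtrace above, here is a minimal sketch (not part of the original report) showing that each bracketed address in the symbolized frame list is the frame's load base "b" plus its offset "o" from the JSON, both in hex. It assumes the "\{" sequences in the export are escaping and have been turned back into "{" before parsing. The sample string holds two frames copied from the backtrace, and resolve_frames is a hypothetical helper name. The mangled names it returns can be passed to a demangler such as c++filt.

```python
import json


def resolve_frames(backtrace_json):
    """Return (absolute_address, mangled_symbol_or_None) for each frame."""
    doc = json.loads(backtrace_json)
    frames = []
    for frame in doc["backtrace"]:
        base = int(frame["b"], 16)    # load address of the module
        offset = int(frame["o"], 16)  # offset of the frame inside the module
        frames.append((base + offset, frame.get("s")))
    return frames


# Two frames copied from the backtrace above, with the "\{" escapes removed.
sample = (
    '{"backtrace":['
    '{"b":"55E7D32FE000","o":"2914361","s":"_ZN5mongo15printStackTraceERSo"},'
    '{"b":"55E7D32FE000","o":"24529AC",'
    '"s":"_ZN5mongo18PlanCacheIndexTree13setIndexEntryERKNS_10IndexEntryE"}'
    ']}'
)

for address, symbol in resolve_frames(sample):
    print(f"{address:#x} {symbol or '<no symbol>'}")
```

The two printed addresses correspond to the printStackTrace and PlanCacheIndexTree::setIndexEntry frames in the symbolized list.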

2022-03-28T07:29:10.621+0100 I  CONTROL  [main] ***** SERVER RESTARTED *****
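Since the report says the crash happened many times, a small log scan can help list each occurrence. The following is a rough sketch under stated assumptions, not an official tool: it assumes the 4.2-style plain-text log layout seen in the lines above (timestamp, severity, component, bracketed context, message). crash_events and parse_ts are hypothetical helper names.

```python
import re
from datetime import datetime

# Layout assumed from the log lines quoted in this ticket:
# <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\S+)\s+(?P<sev>[A-Z])\s+(?P<comp>\S+)\s+"
    r"\[(?P<ctx>[^\]]+)\]\s+(?P<msg>.*)$"
)


def crash_events(lines):
    """Return (timestamp, kind) pairs for fatal-signal and restart lines."""
    events = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        msg = match.group("msg")
        if match.group("sev") == "F" and msg.startswith("Got signal"):
            events.append((match.group("ts"), "crash"))
        elif "SERVER RESTARTED" in msg:
            events.append((match.group("ts"), "restart"))
    return events


def parse_ts(ts):
    """Parse a timestamp such as 2022-03-28T04:29:25.509+0100."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")


# Lines copied from the description above.
sample = [
    "2022-03-28T04:29:25.440+0100 F  -        [conn1079167] Invalid access at address: 0x55e7d6b96738",
    "2022-03-28T04:29:25.509+0100 F  -        [conn1079167] Got signal: 11 (Segmentation fault).",
    "2022-03-28T07:29:10.621+0100 I  CONTROL  [main] ***** SERVER RESTARTED *****",
]

events = crash_events(sample)
for ts, kind in events:
    print(ts, kind)

# Gap between the crash line and the next restart line in the sample.
print(parse_ts(events[1][0]) - parse_ts(events[0][0]))
```

To scan a real log, pass an open file object instead of the sample list. The file path depends on the deployment.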

 



 Comments   
Comment by Anton Korshunov [ 06/Sep/22 ]

dmitry.agranat@mongodb.com mail2vjshree@gmail.com Sorry for the long saga with this issue, but given that we haven't been actively looking into it since June, I thought I'd provide an update and advise on the best course of action. Unfortunately, we have exhausted our ability to troubleshoot this issue based on the provided details. To the best of our knowledge, this issue seems to be related to memory corruption or a use-after-free error, which is very hard to diagnose without a repro, as the problem could have occurred some time earlier and the failing query is just a victim. There is a possibility that the issue has been fixed in later versions, as it seems to be limited to v4.4.5 and we haven't observed it on any recent versions. If you want us to continue with the investigation, we will need to collect more diagnostic info and try to come up with a repro script. Logs and core dumps would be of great help if you could provide them, as well as FTDC data around the time of failure to better understand the workload. Meanwhile, I'm going to send this ticket back to the Query backlog until we have more information.

Comment by Ruoxin Xu [ 31/May/22 ]

Hi dmitry.agranat@mongodb.com! It seems that the backtrace in the comments is different from the one in the Description. Are we facing two different issues? Do we have core dumps or any means to reproduce this problem? I also checked the FTDC metrics but couldn’t find any metrics around the time the server crashed (2022-04-05T00:01:14.426).

Comment by Kyle Suarez [ 13/May/22 ]

mail2vjshree@gmail.com, I apologize for the lack of an update here – the team has been filled to capacity with work for the 6.0 release and we have not yet had time to look at this. I am flagging this ticket for re-triage to ask if another Query Execution subteam can take on this investigation.

Comment by Kyle Suarez [ 22/Apr/22 ]

Hi mail2vjshree@gmail.com, we are looking at this ticket and will try to provide an update soon – thank you for your patience.

Comment by VJ Shree [ 18/Apr/22 ]

Hi Kyle

Could you please provide a status update on the above request?

Comment by VJ Shree [ 12/Apr/22 ]

Hi Dima

I have uploaded the script output to the upload portal. Please confirm.

Comment by VJ Shree [ 11/Apr/22 ]

Hi Dima

The event also occurred on 5th April. We are waiting for the script output and will update ASAP.

Comment by Dmitry Agranat [ 11/Apr/22 ]

Thanks mail2vjshree@gmail.com. The attached diagnostic.data covers April 4th, while the reported event happened on March 28th. Do you still have diagnostic.data covering March 28th? Also, did you have a chance to execute the script from my last comment?
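Before uploading, a quick check like the one below can show whether a diagnostic.data directory reaches back far enough to include the crash. This is only a sketch under an assumption: that the FTDC files are named metrics.<UTC start time>Z-<sequence>, for example metrics.2022-04-04T00-00-00Z-00000. The file names in the example are hypothetical, and starts_before is a hypothetical helper. It only checks a necessary condition, namely that the oldest file starts before the event.

```python
import re
from datetime import datetime, timezone

# Assumed FTDC file naming: metrics.YYYY-MM-DDTHH-MM-SSZ-NNNNN
METRICS_NAME = re.compile(
    r"^metrics\.(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})Z-\d+$"
)


def ftdc_start_times(names):
    """Return the sorted UTC start times encoded in FTDC file names."""
    starts = []
    for name in names:
        match = METRICS_NAME.match(name)
        if match:
            start = datetime.strptime(match.group(1), "%Y-%m-%dT%H-%M-%S")
            starts.append(start.replace(tzinfo=timezone.utc))
    return sorted(starts)


def starts_before(names, event):
    """True if the oldest FTDC file starts at or before the event time."""
    starts = ftdc_start_times(names)
    return bool(starts) and starts[0] <= event


# Crash time from the description (local offset +01:00).
crash = datetime.fromisoformat("2022-03-28T04:29:25+01:00")

# Hypothetical listing; on a real server this could come from
# [p.name for p in pathlib.Path(dbpath, "diagnostic.data").iterdir()].
names = ["metrics.2022-04-04T00-00-00Z-00000", "metrics.interim"]
print(starts_before(names, crash))
```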

Comment by VJ Shree [ 07/Apr/22 ]

Attached and uploaded the logs (diagnostic.data, messages and dmesg). Will run the script and send you the output.

Thanks

Comment by Dmitry Agranat [ 07/Apr/22 ]

Thank you mail2vjshree@gmail.com for providing the log. Did you have a chance to upload the rest of the information to the secure uploader?

  • the $dbpath/diagnostic.data directory (the contents are described here)
  • the full messages and dmesg logs

We would also need to gather configuration and statistical information about this server. Please download the mdiag.sh script from our GitHub repository and run it on the server in question.

Note: This script should be run from the command line/OS shell with root/sudo privileges as follows:

sudo bash mdiag.sh SERVER-65252

It will take about 5 minutes to run. Once completed, please attach the resulting /tmp/mdiag-$HOSTNAME.txt to the ticket.

Comment by VJ Shree [ 06/Apr/22 ]

Hi Dima

 
Thanks for the quick response. This issue was occurring frequently, so we upgraded to v4.4.5, and after the upgrade we are facing the segmentation fault error on 4.4 as well. We have uploaded the MongoDB log to the provided drive.

Below is the only event we saw in dmesg

 

Comment by Dmitry Agranat [ 06/Apr/22 ]

mail2vjshree@gmail.com,

I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time.

For the server in question that includes the incident, would you please archive (tar or zip) and upload to that link:

  • the full mongod logs
  • the $dbpath/diagnostic.data directory (the contents are described here)
  • messages and dmesg logs
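
One possible way to bundle the requested items into a single archive is sketched below with Python's standard tarfile module. This is an illustration only: archive_diagnostics is a hypothetical helper, and the paths in the commented example are placeholders that will differ per server. The dmesg output would first need to be saved to a file.

```python
import tarfile
from pathlib import Path


def archive_diagnostics(paths, out_file):
    """Bundle the existing paths into a gzip tarball; return the missing ones."""
    missing = []
    with tarfile.open(out_file, "w:gz") as tar:
        for path in map(Path, paths):
            if path.exists():
                # Directories such as diagnostic.data are added recursively.
                tar.add(path, arcname=path.name)
            else:
                missing.append(str(path))
    return missing


# Example call with placeholder paths (adjust to the actual server layout):
# missing = archive_diagnostics(
#     ["/var/log/mongodb/mongod.log",
#      "/var/lib/mongo/diagnostic.data",
#      "/var/log/messages"],
#     "SERVER-65252-diagnostics.tar.gz",
# )
# print("not found:", missing)
```

The returned list makes it easy to see which requested items were not found before uploading.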

Was this a one-time occurrence or does this happen repeatedly?

Regards,
Dima

Generated at Thu Feb 08 06:02:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.