[SERVER-40605] Crashed after mongod started for uncertain period of time Created: 12/Apr/19  Updated: 16/Nov/21  Resolved: 13/May/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.4.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tianlei Mou Assignee: Geert Bosch
Resolution: Cannot Reproduce Votes: 0
Labels: SWNA
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive diagnostic.data.zip     HTML File message    
Operating System: ALL
Sprint: Storage NYC 2019-05-20
Participants:

 Description   

CentOS 7.3, no vm, no container

mongod sometimes crashes with signal 11 and sometimes with signal 6. [edit: logs provided in comments]



 Comments   
Comment by Geert Bosch [ 13/May/19 ]

Hi Vincent, after careful analysis of the backtraces and logs that you provided, we cannot make progress on finding a root cause. If you encounter further issues, please let us know. If possible, consider using the validate command to check for consistency issues. Note that this will lock the collection while checking, so make sure to run against a node that is not required to serve other database requests.
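
A minimal sketch of how the suggested consistency check could be scripted. It assumes a mongo-shell-style `db` handle and uses the `validate` database command with its `full` option; the helper name `summarizeValidate` is hypothetical and not part of MongoDB. As noted above, the command locks the collection while it runs.

```javascript
// Hypothetical helper (not a MongoDB API): runs the `validate` command through
// a mongo-shell-style `db` handle and condenses the reply.
function summarizeValidate(db, collectionName) {
  // `full: true` requests the more thorough (and slower) check.
  const reply = db.runCommand({ validate: collectionName, full: true });
  const errors = Array.isArray(reply.errors) ? reply.errors : [];
  return {
    collection: collectionName,
    commandOk: reply.ok === 1,     // the command itself succeeded
    valid: reply.valid === true,   // no inconsistencies were reported
    errorCount: errors.length,
    errors: errors,
  };
}
```

In the mongo shell this could be called as `printjson(summarizeValidate(db.getSiblingDB("efs"), "ue_stats"))`, preferably on a node that is not serving other requests.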

Comment by Tianlei Mou [ 29/Apr/19 ]

Hi Geert,

The 3dis service does many things. Here are some points that might be relevant:

  • The issue occurred while we were upgrading the service, which creates new indexes on some existing collections. Specifically, it happened while creating the following TTL index: 

 

db.ue_stats.createIndex({ "time": 1 }, { background: true, expireAfterSeconds: 3600 * 24 * 200 });

  • The ue_stats collection has about 300 million documents. All operations on it are queries and upserts (updates far outnumber inserts). These operations are not paused while the index is being built.
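
For reference, the `expireAfterSeconds` expression in the index definition works out to 200 days. The sketch below only spells out that arithmetic and the shape of the arguments to `createIndex`; the variable names are illustrative. Note that a TTL index only expires documents whose indexed field holds a date value, and the ticket does not say what type `time` has in ue_stats.

```javascript
// Spell out the TTL arithmetic used in the index definition.
const SECONDS_PER_DAY = 3600 * 24;                          // 86400
const retentionDays = 200;
const expireAfterSeconds = SECONDS_PER_DAY * retentionDays; // 17280000

// Same keys and options as in the createIndex call quoted in the ticket.
const ttlIndex = {
  keys: { time: 1 },
  options: { background: true, expireAfterSeconds: expireAfterSeconds },
};
```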

We tried to dump this collection, but the dump failed even with --forceTableScan, with the same error as in the log. Under pressure from our customer, we deleted and rebuilt the collection (it is an intermediate collection derived from other collections). The issue has not recurred since. If you still want to investigate, I can provide all the information that I am allowed to share.

Thanks a lot for your help.

Br,

Vincent

 

Comment by Geert Bosch [ 25/Apr/19 ]

This definitely helps in narrowing down where there are errors. Essentially this means that keys were not in ascending order in the index, which should never happen. I don't know about this 3dis service, can you give more information?
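
To illustrate the property being violated: the WTCursor::next message quoted in this ticket fires when the next key is not strictly greater than the last one returned. A minimal sketch of that check over a plain list of ids follows; the function name is hypothetical and this is not how mongod itself implements it.

```javascript
// Return the first position where the sequence stops being strictly
// ascending, or null if the whole sequence is in order.
function firstOrderViolation(recordIds) {
  for (let i = 1; i < recordIds.length; i++) {
    if (!(recordIds[i] > recordIds[i - 1])) {
      return { index: i, previous: recordIds[i - 1], current: recordIds[i] };
    }
  }
  return null;
}
```

A repeated id such as the RecordId(81210121) pair in the log message counts as a violation, because the comparison is strict.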

Comment by Tianlei Mou [ 25/Apr/19 ]

Hi Eric,

Here are some clues for this ticket:

  • Storage information
    • DDR4 memory
    • 3 SATA HDDs in RAID 5
    • Please let me know if you need more information about the hardware
  • We did not find any memory or hardware errors in the file /var/log/message around the time the issue happened, but we did find something very strange in this log file. Please refer to the attachment "message". 3dis is a service we use to write data into MongoDB. We got a lot of this kind of "error" message on the same day the issue happened.
  • We found another issue on the same mongod instance when we were trying to create an index on another collection in the background. Progress was stuck at 30% for days, and a large number of messages like the following appeared in the mongod log: 

    2019-04-24T17:55:53.502+0800 I STORAGE  [conn197] WTCursor::next -- c->next_key ( RecordId(81210121)) was not greater than _lastReturnedId (RecordId(81210121)) which is a bug.
    

I do not know whether this is related, but I hope it helps a little.

 

Comment by Eric Sedor [ 23/Apr/19 ]

Hi sh5dragon5, are you able to provide specific details about the storage subsystem for this machine, and let us know what (if any) errors are showing up in syslog?

Comment by Geert Bosch [ 19/Apr/19 ]

I looked at this. Because this is called from decodeRecordIdAtEnd, this is not impacted by any data in the buffer except for the size of the buffer and the final bytes encoding the record id. Because the validity of the RecordId does not depend on the validity of the entire buffer, it is unlikely to be caused by a logic error in KeyString. More likely there has been memory corruption in preserving the length of the buffer, or there has been corruption in the last few bytes of the buffer at a time when it was not protected by a checksum.

Such corruption can be caused by either a memory overwrite from unrelated code, or a (possibly transient) memory error. If the corruption was introduced before checksum computation, the read triggering the invariant should be repeatable. If the symptoms differ each time, it would appear that the corruption gets introduced each time after successfully reading the data. If it really was flakey memory however, I'd expect a variety of symptoms including CRC errors.
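
The reasoning above can be made concrete with a simplified model of a RecordId encoding that is decodable from the end of a buffer. It is a sketch based only on what the failed invariant `(lastByte & 0x7) == numExtraBytes` and the function name decodeRecordIdAtEnd suggest: a 3-bit byte count stored at both ends of the encoded id. The function names are reused for readability; the bit layout is an assumption and the real KeyString code may differ in detail.

```javascript
// Simplified model (assumption, not the mongod source): a 3-bit count N of
// "extra" bytes sits in the high 3 bits of the first byte and again in the
// low 3 bits of the last byte; the remaining bits hold the id, big-endian.
function encodeRecordId(id) {
  let value = BigInt(id);
  if (value < 0n || value >= (1n << 63n)) {
    throw new RangeError("id out of range for this sketch");
  }
  const bitsNeeded = value === 0n ? 0 : value.toString(2).length;
  const extra = bitsNeeded <= 10 ? 0 : Math.ceil((bitsNeeded - 10) / 8);
  const bytes = new Uint8Array(extra + 2);
  // Last byte: lowest 5 bits of the value, then the 3-bit count.
  bytes[extra + 1] = Number(((value & 0x1fn) << 3n) | BigInt(extra));
  value >>= 5n;
  // Middle bytes, filled from least to most significant.
  for (let i = extra; i >= 1; i--) {
    bytes[i] = Number(value & 0xffn);
    value >>= 8n;
  }
  // First byte: the 3-bit count, then the highest 5 bits of the value.
  bytes[0] = Number((BigInt(extra) << 5n) | value);
  return bytes;
}

// Decode the id stored at the end of `buf`. Only the buffer length and the
// trailing bytes are consulted; anything before them is ignored.
function decodeRecordIdAtEnd(buf) {
  if (buf.length < 2) throw new Error("buffer too small");
  const lastByte = buf[buf.length - 1];
  const size = 2 + (lastByte & 0x7);
  if (buf.length < size) throw new Error("buffer shorter than encoded id");
  const start = buf.length - size;
  const firstByte = buf[start];
  const numExtraBytes = firstByte >> 5;
  if ((lastByte & 0x7) !== numExtraBytes) {
    throw new Error("Invariant failure (lastByte & 0x7) == numExtraBytes");
  }
  let repr = BigInt(firstByte & 0x1f);
  for (let i = 1; i <= numExtraBytes; i++) {
    repr = (repr << 8n) | BigInt(buf[start + i]);
  }
  return (repr << 5n) | BigInt(lastByte >> 3);
}
```

In this model, changing the stored length or flipping a bit in either tag byte is enough to trip the check, while bytes earlier in the buffer have no effect on the result, which is consistent with the argument that the failure points to corruption of the length or the final bytes rather than to the rest of the key.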

Comment by Tianlei Mou [ 18/Apr/19 ]

Thanks Eric. Here are the follow-ups.

  • _Does querying for {adCode: "371702", imsi: "460015901329911"} on efs.ue_debut always cause an issue?_ - No, we tried this query several times today without any issue.

  • Storage Engine is WiredTiger.
  • The diagnostic file is attached.

Comment by Eric Sedor [ 17/Apr/19 ]

Thanks sh5dragon5. What storage engine are you using?

Can you also please archive (tar or zip) the $dbpath/diagnostic.data directory (the contents of this directory are described here) and attach it to this ticket?

Comment by Tianlei Mou [ 17/Apr/19 ]

Sorry Eric, I missed an important message in the log. It appears a few lines above the excerpt I posted earlier:

2019-04-10T22:31:14.112+0800 I COMMAND  [conn39] command efs.ue_firstin_buff command: findAndModify { findAndModify: "ue_firstin_buff", query: { adCode: "371702", imsi: "460006718066203", time: 1518537600 }, new: true, update: { $inc: { count: 1 } }, upsert: true } planSummary: COLLSCAN update: { $inc: { count: 1 } } keysExamined:0 docsExamined:386848 nMatched:1 nModified:1 numYields:3023 reslen:179 locks:{ Global: { acquireCount: { r: 3024, w: 3024 } }, Database: { acquireCount: { w: 3024 } }, Collection: { acquireCount: { w: 3024 } } } protocol:op_query 419ms
2019-04-10T22:31:14.114+0800 I -        [conn39] Invariant failure (lastByte & 0x7) == numExtraBytes src/mongo/db/storage/key_string.cpp 1736
2019-04-10T22:31:14.114+0800 I -        [conn39] ***aborting after invariant() failure

As for your first question, I will ask our onsite ops to check soon.

Thanks for your help.

Comment by Eric Sedor [ 16/Apr/19 ]

Thanks for the added information so far.

  • Does querying for {adCode: "371702", imsi: "460015901329911"} on efs.ue_debut always cause an issue?
  • Can you provide the last logged line for connection 39 (conn39) for the above Signal 6 crash?

Comment by Tianlei Mou [ 15/Apr/19 ]

2019-04-10T22:31:14.753+0800 I COMMAND [conn38] command efs.ue_firstin_buff command: findAndModify \{ findAndModify: "ue_firstin_buff", query: { adCode: "371702", imsi: "460002171744466", time: 1518537600 }, new: true, update: \{ $inc: { count: 1 } }, upsert: true } planSummary: COLLSCAN update: \{ $inc: { count: 1 } } keysExamined:0 docsExamined:633397 nMatched:0 nModified:0 upsert:1 keysInserted:1 numYields:4950 reslen:201 locks:\{ Global: { acquireCount: { r: 4951, w: 4951 } }, Database: \{ acquireCount: { w: 4951 } }, Collection: \{ acquireCount: { w: 4951 } } } protocol:op_query 735ms
2019-04-10T22:31:14.780+0800 I COMMAND [conn49] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460027650360341" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:450583 cursorExhausted:1 numYields:3521 nreturned:1 reslen:270 locks:\{ Global: { acquireCount: { r: 7044 } }, Database: \{ acquireCount: { r: 3522 } }, Collection: \{ acquireCount: { r: 3522 } } } protocol:op_query 596ms
2019-04-10T22:31:14.807+0800 I COMMAND [conn42] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460022650666117" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:87787 cursorExhausted:1 numYields:685 nreturned:1 reslen:276 locks:\{ Global: { acquireCount: { r: 1372 } }, Database: \{ acquireCount: { r: 686 } }, Collection: \{ acquireCount: { r: 686 } } } protocol:op_query 117ms
2019-04-10T22:31:14.814+0800 I WRITE [conn81] update efs.runoff_stats_wifimac query: \{ domainGroup: "3717020027", time: new Date(1554933600000) } planSummary: IXSCAN \{ time: 1, domainGroup: 1 } update: \{ $inc: { count: 1 } } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 writeConflicts:1 numYields:1 locks:\{ Global: { acquireCount: { r: 2, w: 2 } }, Database: \{ acquireCount: { w: 2 } }, Collection: \{ acquireCount: { w: 2 } } } 329ms
2019-04-10T22:31:14.814+0800 I COMMAND [conn81] command efs.$cmd command: update \{ update: "runoff_stats_wifimac", ordered: true, updates: [ { q: { domainGroup: "3717020027", time: new Date(1554933600000) }, u: \{ $inc: { count: 1 } }, multi: false, upsert: true } ] } numYields:0 reslen:59 locks:\{ Global: { acquireCount: { r: 2, w: 2 } }, Database: \{ acquireCount: { w: 2 } }, Collection: \{ acquireCount: { w: 2 } } } protocol:op_query 329ms
2019-04-10T22:31:14.823+0800 I COMMAND [conn45] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460110324173973" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:393034 cursorExhausted:1 numYields:3071 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 6144 } }, Database: \{ acquireCount: { r: 3072 } }, Collection: \{ acquireCount: { r: 3072 } } } protocol:op_query 423ms
2019-04-10T22:31:14.906+0800 I COMMAND [conn141] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460012075428006" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:297888 cursorExhausted:1 numYields:2328 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 4658 } }, Database: \{ acquireCount: { r: 2329 } }, Collection: \{ acquireCount: { r: 2329 } } } protocol:op_query 307ms
2019-04-10T22:31:14.938+0800 I COMMAND [conn50] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460015291112371" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:292081 cursorExhausted:1 numYields:2283 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 4568 } }, Database: \{ acquireCount: { r: 2284 } }, Collection: \{ acquireCount: { r: 2284 } } } protocol:op_query 351ms
2019-04-10T22:31:14.958+0800 I COMMAND [conn152] command efs.ue_firstin_buff command: findAndModify \{ findAndModify: "ue_firstin_buff", query: { adCode: "371702", imsi: "460013005681850", time: 1518537600 }, new: true, update: \{ $inc: { count: 1 } }, upsert: true } planSummary: COLLSCAN update: \{ $inc: { count: 1 } } keysExamined:0 docsExamined:496898 nMatched:1 nModified:1 numYields:3883 reslen:179 locks:\{ Global: { acquireCount: { r: 3884, w: 3884 } }, Database: \{ acquireCount: { w: 3884 } }, Collection: \{ acquireCount: { w: 3884 } } } protocol:op_query 575ms
2019-04-10T22:31:14.965+0800 F - [conn39] *Got signal: 6* (Aborted).
 
0x7f66c4c74f81 0x7f66c4c74199 0x7f66c4c7467d 0x7f66c235e370 0x7f66c1fc31d7 0x7f66c1fc48c8 0x7f66c3f1cf14 0x7f66c4891ee0 0x7f66c4891f5c 0x7f66c495a2fe 0x7f66c4961d79 0x7f66c4962712 0x7f66c429c227 0x7f66c429c4df 0x7f66c42aca13 0x7f66c428b06e 0x7f66c42aca13 0x7f66c42a0c96 0x7f66c42aca13 0x7f66c427e688 0x7f66c45b2f92 0x7f66c45b5338 0x7f66c45b5fec 0x7f66c456f4b2 0x7f66c457001b 0x7f66c4198238 0x7f66c416f02f 0x7f66c4170711 0x7f66c4788a90 0x7f66c438c162 0x7f66c438e146 0x7f66c3f8c90d 0x7f66c3f8d23d 0x7f66c4bdcef1 0x7f66c2356dc5 0x7f66c208573d
----- BEGIN BACKTRACE -----
{"backtrace":[\{"b":"7F66C3702000","o":"1572F81","s":"_ZN5mongo15printStackTraceERSo"},\{"b":"7F66C3702000","o":"1572199"},\{"b":"7F66C3702000","o":"157267D"},\{"b":"7F66C234F000","o":"F370"},\{"b":"7F66C1F8E000","o":"351D7","s":"gsignal"},\{"b":"7F66C1F8E000","o":"368C8","s":"abort"},\{"b":"7F66C3702000","o":"81AF14","s":"_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j"},\{"b":"7F66C3702000","o":"118FEE0","s":"_ZN5mongo9KeyString14decodeRecordIdEPNS_9BufReaderE"},\{"b":"7F66C3702000","o":"118FF5C","s":"_ZN5mongo9KeyString19decodeRecordIdAtEndEPKvm"},\{"b":"7F66C3702000","o":"12582FE"},\{"b":"7F66C3702000","o":"125FD79"},\{"b":"7F66C3702000","o":"1260712"},\{"b":"7F66C3702000","o":"B9A227","s":"_ZN5mongo9IndexScan13initIndexScanEv"},\{"b":"7F66C3702000","o":"B9A4DF","s":"_ZN5mongo9IndexScan6doWorkEPm"},\{"b":"7F66C3702000","o":"BAAA13","s":"_ZN5mongo9PlanStage4workEPm"},\{"b":"7F66C3702000","o":"B8906E","s":"_ZN5mongo10FetchStage6doWorkEPm"},\{"b":"7F66C3702000","o":"BAAA13","s":"_ZN5mongo9PlanStage4workEPm"},\{"b":"7F66C3702000","o":"B9EC96","s":"_ZN5mongo10LimitStage6doWorkEPm"},\{"b":"7F66C3702000","o":"BAAA13","s":"_ZN5mongo9PlanStage4workEPm"},\{"b":"7F66C3702000","o":"B7C688","s":"_ZN5mongo15CachedPlanStage12pickBestPlanEPNS_15PlanYieldPolicyE"},\{"b":"7F66C3702000","o":"EB0F92","s":"_ZN5mongo12PlanExecutor12pickBestPlanENS0_11YieldPolicyEPKNS_10CollectionE"},\{"b":"7F66C3702000","o":"EB3338","s":"_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS0_11YieldPolicyE"},\{"b":"7F66C3702000","o":"EB3FEC","s":"_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionENS0_11YieldPolicyE"},\{
"b":"7F66C3702000","o":"E6D4B2","s":"_ZN5mongo11getExecutorEPNS_16OperationContextEPNS_10CollectionESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS5_EENS_12PlanExecutor11YieldPolicyEm"},\{"b":"7F66C3702000","o":"E6E01B","s":"_ZN5mongo15getExecutorFindEPNS_16OperationContextEPNS_10CollectionERKNS_15NamespaceStringESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS8_EENS_12PlanExecutor11YieldPolicyE"},\{"b":"7F66C3702000","o":"A96238","s":"_ZN5mongo7FindCmd3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE"},\{"b":"7F66C3702000","o":"A6D02F","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},\{"b":"7F66C3702000","o":"A6E711","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},\{"b":"7F66C3702000","o":"1086A90","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},\{"b":"7F66C3702000","o":"C8A162"},\{"b":"7F66C3702000","o":"C8C146","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},\{"b":"7F66C3702000","o":"88A90D","s":"_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE"},\{"b":"7F66C3702000","o":"88B23D"},\{"b":"7F66C3702000","o":"14DAEF1"},\{"b":"7F66C234F000","o":"7DC5"},\{"b":"7F66C1F8E000","o":"F773D","s":"clone"}],"processInfo":\{ "mongodbVersion" : "3.4.7", "gitVersion" : "cf38c1b8a0a8dca4a11737581beafef4fe120bcd", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-123.el7.x86_64", "version" : "#1 SMP Mon Jun 30 12:09:22 UTC 2014", "machine" : "x86_64" }, "somap" : [ \{ "b" : "7F66C3702000", "elfType" : 3, "buildId" : "433E85C3D902A85D7DFFD7F281B1A2C48A7ED7CD" }, \{ "b" : "7FFFD4A85000", "elfType" : 3, "buildId" : "D7952DC468957C2B14B6BB79E613D48BA1224706" }, \{ "b" : 
"7F66C3274000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "58FEDFFED1A388AD9E495F9A6C91A851B9537765" }, \{ "b" : "7F66C2E8F000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "F214E8640FDA5097E7A90CE7974B3FF76C6C42D9" }, \{ "b" : "7F66C2C87000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "82E77ADE22BC9FFF8D3458BD37331E7EDF174C28" }, \{ "b" : "7F66C2A83000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "C5F560504E1AF52E29679C3B52FF11121015D6BB" }, \{ "b" : "7F66C2781000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "721C7CC9488EFA25F83B48AF713AB27DBE48EF3E" }, \{ "b" : "7F66C256B000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "408B46E291B2D4C9612E27C0509D165D7E186D40" }, \{ "b" : "7F66C234F000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "C3DEB1FA27CD0C1C3CC575B944ABACBA0698B0F2" }, \{ "b" : "7F66C1F8E000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8B2C421716985B927AA0CAF2A05D0B1F452367F7" }, \{ "b" : "7F66C34E0000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" }, \{ "b" : "7F66C1D44000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "641A441AB91715A7E3AF8AD9AF38EE07F17866FE" }, \{ "b" : "7F66C1A64000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "08E8BA638E79EC07F98198ED40F90FA87D5EEEB5" }, \{ "b" : "7F66C1860000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "D2678F5F391BF2877E1BD6FAD16DBC589ED0BBF3" }, \{ "b" : "7F66C162B000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "8269E77C68B707158D2B1BEA356EE0FC2A1C0024" }, \{ "b" : "7F66C1415000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "E45643F27F3B3E960F3691AFC6EC27A98EF7B46B" }, \{ "b" : "7F66C1207000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "577A21CDAA3D662B87D53AFAA12A1E7B34AD513F" }, \{ "b" : "7F66C1003000", "path" : 
"/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, \{ "b" : "7F66C0DE9000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "FE7AE845A123A3DFC0FDC2408BCBC2BA8B61B158" }, \{ "b" : "7F66C0BC4000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, \{ "b" : "7F66C0963000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "B19961A753FDFF85BD071340139A7F024BAEFFCA" }, \{ "b" : "7F66C073E000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "218D03D1F6CF1A099A4D467B5E8ECF4F2BF45750" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f66c4c74f81]
 mongod(+0x1572199) [0x7f66c4c74199]
 mongod(+0x157267D) [0x7f66c4c7467d]
 libpthread.so.0(+0xF370) [0x7f66c235e370]
 libc.so.6(gsignal+0x37) [0x7f66c1fc31d7]
 libc.so.6(abort+0x148) [0x7f66c1fc48c8]
 mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0x0) [0x7f66c3f1cf14]
 mongod(_ZN5mongo9KeyString14decodeRecordIdEPNS_9BufReaderE+0x280) [0x7f66c4891ee0]
 mongod(_ZN5mongo9KeyString19decodeRecordIdAtEndEPKvm+0x4C) [0x7f66c4891f5c]
 mongod(+0x12582FE) [0x7f66c495a2fe]
 mongod(+0x125FD79) [0x7f66c4961d79]
 mongod(+0x1260712) [0x7f66c4962712]
 mongod(_ZN5mongo9IndexScan13initIndexScanEv+0x2C7) [0x7f66c429c227]
 mongod(_ZN5mongo9IndexScan6doWorkEPm+0x14F) [0x7f66c429c4df]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x7f66c42aca13]
 mongod(_ZN5mongo10FetchStage6doWorkEPm+0x29E) [0x7f66c428b06e]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x7f66c42aca13]
 mongod(_ZN5mongo10LimitStage6doWorkEPm+0x76) [0x7f66c42a0c96]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x7f66c42aca13]
 mongod(_ZN5mongo15CachedPlanStage12pickBestPlanEPNS_15PlanYieldPolicyE+0x198) [0x7f66c427e688]
 mongod(_ZN5mongo12PlanExecutor12pickBestPlanENS0_11YieldPolicyEPKNS_10CollectionE+0xF2) [0x7f66c45b2f92]
 mongod(_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS0_11YieldPolicyE+0x2D8) [0x7f66c45b5338]
 mongod(_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionENS0_11YieldPolicyE+0xEC) [0x7f66c45b5fec]
 mongod(_ZN5mongo11getExecutorEPNS_16OperationContextEPNS_10CollectionESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS5_EENS_12PlanExecutor11YieldPolicyEm+0x132) [0x7f66c456f4b2]
 mongod(_ZN5mongo15getExecutorFindEPNS_16OperationContextEPNS_10CollectionERKNS_15NamespaceStringESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS8_EENS_12PlanExecutor11YieldPolicyE+0x8B) [0x7f66c457001b]
 mongod(_ZN5mongo7FindCmd3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE+0xC98) [0x7f66c4198238]
 mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x4FF) [0x7f66c416f02f]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0xF81) [0x7f66c4170711]
 mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x240) [0x7f66c4788a90]
 mongod(+0xC8A162) [0x7f66c438c162]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x746) [0x7f66c438e146]
 mongod(_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE+0x1FD) [0x7f66c3f8c90d]
 mongod(+0x88B23D) [0x7f66c3f8d23d]
 mongod(+0x14DAEF1) [0x7f66c4bdcef1]
 libpthread.so.0(+0x7DC5) [0x7f66c2356dc5]
 libc.so.6(clone+0x6D) [0x7f66c208573d]
----- END BACKTRACE -----
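
A small sketch for cross-checking the two backtrace formats above. It assumes that in the JSON form "b" is the load base of the module and "o" is the offset within it, so that their sum is the absolute address shown in square brackets in the plain-text frames. The helper name is hypothetical.

```javascript
// Add a frame's hexadecimal base ("b") and offset ("o") and return the
// absolute address as a lowercase hex string. BigInt avoids precision loss
// on 64-bit addresses.
function absoluteAddress(frame) {
  return (BigInt("0x" + frame.b) + BigInt("0x" + frame.o)).toString(16);
}

// First frame of the signal 6 backtrace (printStackTrace):
absoluteAddress({ b: "7F66C3702000", o: "1572F81" }); // "7f66c4c74f81"
```

The result matches the first address in the plain-text listing, 0x7f66c4c74f81, which supports reading the two formats as the same frames.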

Comment by Tianlei Mou [ 15/Apr/19 ]

2019-04-08T18:23:23.037+0800 I COMMAND [conn251] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460027645227389" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:386349 cursorExhausted:1 numYields:3019 nreturned:0 reslen:100 locks:\{ Global: { acquireCount: { r: 6040 } }, Database: \{ acquireCount: { r: 3020 } }, Collection: \{ acquireCount: { r: 3020 } } } protocol:op_query 402ms
2019-04-08T18:23:23.059+0800 I COMMAND [conn234] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460061061032659" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:122171 cursorExhausted:1 numYields:955 nreturned:1 reslen:276 locks:\{ Global: { acquireCount: { r: 1912 } }, Database: \{ acquireCount: { r: 956 } }, Collection: \{ acquireCount: { r: 956 } } } protocol:op_query 124ms
2019-04-08T18:23:23.078+0800 I COMMAND [conn232] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460077260132225" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:111301 cursorExhausted:1 numYields:870 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 1742 } }, Database: \{ acquireCount: { r: 871 } }, Collection: \{ acquireCount: { r: 871 } } } protocol:op_query 128ms
2019-04-08T18:23:23.126+0800 I COMMAND [conn230] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460011496101423" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:180251 cursorExhausted:1 numYields:1408 nreturned:1 reslen:270 locks:\{ Global: { acquireCount: { r: 2818 } }, Database: \{ acquireCount: { r: 1409 } }, Collection: \{ acquireCount: { r: 1409 } } } protocol:op_query 214ms
2019-04-08T18:23:23.322+0800 I COMMAND [conn230] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460016528105739" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:140643 cursorExhausted:1 numYields:1098 nreturned:1 reslen:276 locks:\{ Global: { acquireCount: { r: 2198 } }, Database: \{ acquireCount: { r: 1099 } }, Collection: \{ acquireCount: { r: 1099 } } } protocol:op_query 194ms
2019-04-08T18:23:23.339+0800 I COMMAND [conn251] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460003050758429" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:245790 cursorExhausted:1 numYields:1921 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 3844 } }, Database: \{ acquireCount: { r: 1922 } }, Collection: \{ acquireCount: { r: 1922 } } } protocol:op_query 300ms
2019-04-08T18:23:23.366+0800 I COMMAND [conn235] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460012365405799" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:255708 cursorExhausted:1 numYields:1998 nreturned:1 reslen:276 locks:\{ Global: { acquireCount: { r: 3998 } }, Database: \{ acquireCount: { r: 1999 } }, Collection: \{ acquireCount: { r: 1999 } } } protocol:op_query 306ms
2019-04-08T18:23:23.389+0800 I COMMAND [conn232] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460003060796992" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:113246 cursorExhausted:1 numYields:884 nreturned:1 reslen:276 locks:\{ Global: { acquireCount: { r: 1770 } }, Database: \{ acquireCount: { r: 885 } }, Collection: \{ acquireCount: { r: 885 } } } protocol:op_query 152ms
2019-04-08T18:23:23.480+0800 I COMMAND [conn151] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460095300102085" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:359237 cursorExhausted:1 numYields:2808 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 5618 } }, Database: \{ acquireCount: { r: 2809 } }, Collection: \{ acquireCount: { r: 2809 } } } protocol:op_query 474ms
2019-04-08T18:23:23.514+0800 I COMMAND [conn234] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460028651699331" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:358403 cursorExhausted:1 numYields:2801 nreturned:1 reslen:267 locks:\{ Global: { acquireCount: { r: 5604 } }, Database: \{ acquireCount: { r: 2802 } }, Collection: \{ acquireCount: { r: 2802 } } } protocol:op_query 435ms
2019-04-08T18:23:23.547+0800 F - [conn151] Invalid access at address: 0xffffffffffffffff
2019-04-08T18:23:23.558+0800 I COMMAND [conn236] command efs.ue_debut command: find \{ find: "ue_debut", filter: { adCode: "371702", imsi: "460015901329911" }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:164064 cursorExhausted:1 numYields:1282 nreturned:1 reslen:273 locks:\{ Global: { acquireCount: { r: 2566 } }, Database: \{ acquireCount: { r: 1283 } }, Collection: \{ acquireCount: { r: 1283 } } } protocol:op_query 190ms
2019-04-08T18:23:23.559+0800 F - [conn151] *Got signal: 11* (Segmentation fault).
 
0x7ff1f8f10f81 0x7ff1f8f10199 0x7ff1f8f10806 0x7ff1f65fa370 0x7ff1f855883f 0x7ff1f8548a38 0x7ff1f884f15a 0x7ff1f884fa7b 0x7ff1f843460b 0x7ff1f840b02f 0x7ff1f840c711 0x7ff1f8a24a90 0x7ff1f8628162 0x7ff1f862a146 0x7ff1f822890d 0x7ff1f822923d 0x7ff1f8e78ef1 0x7ff1f65f2dc5 0x7ff1f632173d
----- BEGIN BACKTRACE -----
{"backtrace":[\{"b":"7FF1F799E000","o":"1572F81","s":"_ZN5mongo15printStackTraceERSo"},\{"b":"7FF1F799E000","o":"1572199"},\{"b":"7FF1F799E000","o":"1572806"},\{"b":"7FF1F65EB000","o":"F370"},\{"b":"7FF1F799E000","o":"BBA83F","s":"_ZN5mongo11ScopedTimerD1Ev"},\{"b":"7FF1F799E000","o":"BAAA38","s":"_ZN5mongo9PlanStage4workEPm"},\{"b":"7FF1F799E000","o":"EB115A","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},\{"b":"7FF1F799E000","o":"EB1A7B","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},\{"b":"7FF1F799E000","o":"A9660B","s":"_ZN5mongo7FindCmd3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE"},\{"b":"7FF1F799E000","o":"A6D02F","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},\{"b":"7FF1F799E000","o":"A6E711","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},\{"b":"7FF1F799E000","o":"1086A90","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},\{"b":"7FF1F799E000","o":"C8A162"},\{"b":"7FF1F799E000","o":"C8C146","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},\{"b":"7FF1F799E000","o":"88A90D","s":"_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE"},\{"b":"7FF1F799E000","o":"88B23D"},\{"b":"7FF1F799E000","o":"14DAEF1"},\{"b":"7FF1F65EB000","o":"7DC5"},\{"b":"7FF1F622A000","o":"F773D","s":"clone"}],"processInfo":\{ "mongodbVersion" : "3.4.7", "gitVersion" : "cf38c1b8a0a8dca4a11737581beafef4fe120bcd", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-123.el7.x86_64", "version" : "#1 SMP Mon Jun 30 12:09:22 UTC 2014", "machine" : "x86_64" }, "somap" : [ \{ "b" : "7FF1F799E000", "elfType" : 3, "buildId" : 
"433E85C3D902A85D7DFFD7F281B1A2C48A7ED7CD" }, \{ "b" : "7FFFB55C3000", "elfType" : 3, "buildId" : "D7952DC468957C2B14B6BB79E613D48BA1224706" }, \{ "b" : "7FF1F7510000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "58FEDFFED1A388AD9E495F9A6C91A851B9537765" }, \{ "b" : "7FF1F712B000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "F214E8640FDA5097E7A90CE7974B3FF76C6C42D9" }, \{ "b" : "7FF1F6F23000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "82E77ADE22BC9FFF8D3458BD37331E7EDF174C28" }, \{ "b" : "7FF1F6D1F000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "C5F560504E1AF52E29679C3B52FF11121015D6BB" }, \{ "b" : "7FF1F6A1D000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "721C7CC9488EFA25F83B48AF713AB27DBE48EF3E" }, \{ "b" : "7FF1F6807000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "408B46E291B2D4C9612E27C0509D165D7E186D40" }, \{ "b" : "7FF1F65EB000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "C3DEB1FA27CD0C1C3CC575B944ABACBA0698B0F2" }, \{ "b" : "7FF1F622A000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8B2C421716985B927AA0CAF2A05D0B1F452367F7" }, \{ "b" : "7FF1F777C000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" }, \{ "b" : "7FF1F5FE0000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "641A441AB91715A7E3AF8AD9AF38EE07F17866FE" }, \{ "b" : "7FF1F5D00000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "08E8BA638E79EC07F98198ED40F90FA87D5EEEB5" }, \{ "b" : "7FF1F5AFC000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "D2678F5F391BF2877E1BD6FAD16DBC589ED0BBF3" }, \{ "b" : "7FF1F58C7000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "8269E77C68B707158D2B1BEA356EE0FC2A1C0024" }, \{ "b" : "7FF1F56B1000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "E45643F27F3B3E960F3691AFC6EC27A98EF7B46B" }, \{ "b" : 
"7FF1F54A3000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "577A21CDAA3D662B87D53AFAA12A1E7B34AD513F" }, \{ "b" : "7FF1F529F000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, \{ "b" : "7FF1F5085000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "FE7AE845A123A3DFC0FDC2408BCBC2BA8B61B158" }, \{ "b" : "7FF1F4E60000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "82FF6B18E1E42825CC2D060F969479AD4AF2F62C" }, \{ "b" : "7FF1F4BFF000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "B19961A753FDFF85BD071340139A7F024BAEFFCA" }, \{ "b" : "7FF1F49DA000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "218D03D1F6CF1A099A4D467B5E8ECF4F2BF45750" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7ff1f8f10f81]
 mongod(+0x1572199) [0x7ff1f8f10199]
 mongod(+0x1572806) [0x7ff1f8f10806]
 libpthread.so.0(+0xF370) [0x7ff1f65fa370]
 mongod(_ZN5mongo11ScopedTimerD1Ev+0xF) [0x7ff1f855883f]
 mongod(_ZN5mongo9PlanStage4workEPm+0x88) [0x7ff1f8548a38]
 mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x19A) [0x7ff1f884f15a]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x4B) [0x7ff1f884fa7b]
 mongod(_ZN5mongo7FindCmd3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE+0x106B) [0x7ff1f843460b]
 mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x4FF) [0x7ff1f840b02f]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0xF81) [0x7ff1f840c711]
 mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x240) [0x7ff1f8a24a90]
 mongod(+0xC8A162) [0x7ff1f8628162]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x746) [0x7ff1f862a146]
 mongod(_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE+0x1FD) [0x7ff1f822890d]
 mongod(+0x88B23D) [0x7ff1f822923d]
 mongod(+0x14DAEF1) [0x7ff1f8e78ef1]
 libpthread.so.0(+0x7DC5) [0x7ff1f65f2dc5]
 libc.so.6(clone+0x6D) [0x7ff1f632173d]
----- END BACKTRACE -----

Comment by Eric Sedor [ 12/Apr/19 ]

Thanks for your report. Are you able to provide the logs preceding the stack trace for both the signal 11 and signal 6 cases?
