[SERVER-33176] mongodb crash with Got signal: 11 (Segmentation fault). Created: 07/Feb/18  Updated: 28/Nov/18  Resolved: 17/Aug/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.6.2
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Joe [X] Assignee: ADAM Martin (Inactive)
Resolution: Incomplete Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongod.log    
Issue Links:
Related
related to WT-4037 WT_REF structures freed while still i... Closed
Operating System: ALL
Steps To Reproduce:

I'm not sure how to reproduce. I've been loading a pretty large dataset into a cluster. Network has been stable, not sure what could have caused this.

 Description   

The following took place on one shard of a 7-node cluster: 3 config servers, 3 shards, and 1 router.

The error took place while using mongoimport.

I noticed other bugs reported for 3.4, but I'm using 3.6.2. The other 3.4 bug reports recommended upgrading.

I'm running xenial on arm64.

Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial

2018-02-07T18:58:35.253+0000 F -        [Collection Range Deleter] Invalid access at address: 0xb2c3a44e5d2be4d9
2018-02-07T18:58:35.339+0000 F -        [Collection Range Deleter] Got signal: 11 (Segmentation fault).
 
 0x557edd2fd0 0x557edd2280 0x557edd28e0 0x7f8c9476c0 0x7f8c5d1e3c
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"557CF3D000","o":"1E95FD0","s":"_ZN5mongo15printStackTraceERSo"},{"b":"557CF3D000","o":"1E95280"},{"b":"557CF3D000","o":"1E958E0"},{"b":"7F8C947000","o":"6C0","s":"__kernel_rt_sigreturn"},{"b":"7F8C5C5000","o":"CE3C","s":"pthread_cond_destroy"}],"processInfo":{ "mongodbVersion" : "3.6.2", "gitVersion" : "489d177dbd0f0420a8ca04d39fd78d0a2c539420", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.103-rockchip-ayufan-175", "version" : "#1 SMP Thu Jan 11 16:10:41 UTC 2018", "machine" : "aarch64" }, "somap" : [ { "b" : "557CF3D000", "elfType" : 3, "buildId" : "2216AD75BB3D68D98E9E344DE66B8DCA80788007" }, { "b" : "7F8C947000", "elfType" : 3, "buildId" : "46672A308C2699051AF475666C327086C6412CC0" }, { "b" : "7F8C8F7000", "path" : "/lib/aarch64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "ED7A2662858A4F87391BAE87D2D8BBBB7C406D54" }, { "b" : "7F8C88D000", "path" : "/lib/aarch64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "FE1F68955FAF388EC7A72F8B9EAA75CCC0603B94" }, { "b" : "7F8C6E9000", "path" : "/lib/aarch64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "2AA55047FC1C4C87C0F3688195009316AE78E0AE" }, { "b" : "7F8C6D2000", "path" : "/lib/aarch64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "FFAE838399016D9BEDEF5C85D772A7B1EF228127" }, { "b" : "7F8C6BF000", "path" : "/lib/aarch64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "A14FF51344A5D7E13CAC658D3D7863C7E5B6DDBD" }, { "b" : "7F8C612000", "path" : "/lib/aarch64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "78D5BDD3E860836F91FEAD99000A586662CA92D8" }, { "b" : "7F8C5F1000", "path" : "/lib/aarch64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "A05222936FB1A047AC0567B0861730C781C69A23" }, { "b" : "7F8C5C5000", "path" : "/lib/aarch64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "629C063061FD5A4EAA215BFDE0EC8C02F40893F0" }, { "b" : "7F8C47E000", "path" : "/lib/aarch64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : 
"5A4C72FADB14FE1891F8A6EA77275E1B7F52D820" }, { "b" : "7F8C91C000", "path" : "/lib/ld-linux-aarch64.so.1", "elfType" : 3, "buildId" : "42369A5C3B1A57397BDA983454CDE519F7F8DEC5" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x48) [0x557edd2fd0]
 mongod(+0x1E95280) [0x557edd2280]
 mongod(+0x1E958E0) [0x557edd28e0]
 (__kernel_rt_sigreturn+0x0) [0x7f8c9476c0]
 libpthread.so.0(pthread_cond_destroy+0x1C) [0x7f8c5d1e3c]
-----  END BACKTRACE  -----



 Comments   
Comment by Manan Shah [ 28/Nov/18 ]

Thanks, I opened SERVER-38292

Comment by Ramon Fernandez Marina [ 28/Nov/18 ]

manan@indeed.com, the stack trace you posted is different from the one leading up to WT-4037 – will you please open a separate ticket so we can take a look?

Thanks,
Ramón.
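
A rough way to compare two traces like the ones in this ticket is to list the named WiredTiger frames in each and check for overlap. The sketch below is only illustrative: the helper names (`named_symbols`, `wt_frames`) and the regular expression are assumptions, not part of any MongoDB tooling, and the sample text is a short excerpt of two traces posted in this ticket.

```python
import re

# Matches text frames such as:
#   mongod(__wt_evict+0xAC7) [0x562637934e47]
# The symbol group is empty for unsymbolized frames like "mongod(+0xB19E56) [...]".
FRAME_RE = re.compile(
    r"^\s*(?P<module>[^\s(]*)\((?P<symbol>[^+)]*)\+0x(?P<offset>[0-9A-Fa-f]+)\)"
    r"\s+\[0x(?P<addr>[0-9A-Fa-f]+)\]"
)


def named_symbols(trace_text):
    """Return the symbol names of all symbolized frames, in order."""
    symbols = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group("symbol"):
            symbols.append(match.group("symbol"))
    return symbols


def wt_frames(trace_text):
    """Keep only WiredTiger frames (symbols starting with '__wt_')."""
    return [s for s in named_symbols(trace_text) if s.startswith("__wt_")]


# Excerpt from the 3.6.4 trace (17/Apr/18 comment).
trace_insert = """
 libpthread.so.0(+0x110C0) [0x7f840df050c0]
 mongod(__wt_page_in_func+0x1804) [0x56314b003ad4]
 mongod(__wt_row_search+0x856) [0x56314b026ef6]
 mongod(__wt_btcur_insert+0x1027) [0x56314b097007]
 mongod(+0xB19E56) [0x56314b03fe56]
"""

# Excerpt from the 3.4.14 trace (28/Nov/18 comment).
trace_evict = """
 mongod-3.4(__wt_split_stash_discard+0xC6) [0x5626378e0466]
 mongod-3.4(+0x1EB55EA) [0x5626378e05ea]
 mongod-3.4(__wt_split_reverse+0x83) [0x5626378e3ac3]
 mongod-3.4(__wt_evict+0xAC7) [0x562637934e47]
"""

shared = set(wt_frames(trace_insert)) & set(wt_frames(trace_evict))
print("insert path:", wt_frames(trace_insert))
print("evict path: ", wt_frames(trace_evict))
print("shared WiredTiger frames:", sorted(shared))
```

An empty overlap does not by itself prove that two crashes have different root causes; it only shows that the crashing threads were in different code paths.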

Comment by Manan Shah [ 28/Nov/18 ]

We are also having this issue on both 3.2.19 and 3.4.14 (both use the WT engine). Here's the stack from the 3.4.14 logs. Can you brief me on what exactly the fix in WT-4037 is? It is not clear from that ticket.

2018-11-28T10:06:13.885-0600 F -        [thread2] Invalid access at address: 0
2018-11-28T10:06:13.906-0600 F -        [thread2] Got signal: 11 (Segmentation fault). 0x562636fa4a51 0x562636fa3c69 0x562636fa42d6 0x7f74ee9647e0 0x56263716bca3 0x56263716bd7c 0x562637aa1d7a 0x5626378e0466 0x5626378e05ea 0x5626378e0c08 0x5626378e3ac3 0x562637934294 0x562637934e47 0x56263792f813 0x56263792fba7 0x5626379316a3 0x56263799b906 0x7f74ee95caa1 0x7f74ee6a9c4d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"562635A2B000","o":"1579A51","s":"_ZN5mongo15printStackTraceERSo"},{"b":"562635A2B000","o":"1578C69"},{"b":"562635A2B000","o":"15792D6"},{"b":"7F74EE955000","o":"F7E0"},{"b":"562635A2B000","o":"1740CA3","s":"_ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEmi"},{"b":"562635A2B000","o":"1740D7C","s":"_ZN8tcmalloc11ThreadCache11ListTooLongEPNS0_8FreeListEm"},{"b":"562635A2B000","o":"2076D7A","s":"_ZdlPvRKSt9nothrow_t"},{"b":"562635A2B000","o":"1EB5466","s":"__wt_split_stash_discard"},{"b":"562635A2B000","o":"1EB55EA"},{"b":"562635A2B000","o":"1EB5C08"},{"b":"562635A2B000","o":"1EB8AC3","s":"__wt_split_reverse"},{"b":"562635A2B000","o":"1F09294"},{"b":"562635A2B000","o":"1F09E47","s":"__wt_evict"},{"b":"562635A2B000","o":"1F04813"},{"b":"562635A2B000","o":"1F04BA7"},{"b":"562635A2B000","o":"1F066A3","s":"__wt_evict_thread_run"},{"b":"562635A2B000","o":"1F70906","s":"__wt_thread_run"},{"b":"7F74EE955000","o":"7AA1"},{"b":"7F74EE5C1000","o":"E8C4D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.14", "gitVersion" : "fd954412dfc10e4d1e3e2dd4fac040f8b476b268", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "2.6.32-754.6.3.el6.x86_64", "version" : "#1 SMP Tue Oct 9 17:27:49 UTC 2018", "machine" : "x86_64" }, "somap" : [ { "b" : "562635A2B000", "elfType" : 3, "buildId" : "64CD384C41ACC8D81741F0DFC0F9A3D7756F81FF" }, { "b" : "7FFF3E8F6000", "elfType" : 3, "buildId" : "F9F48CC73D4D61AE273899B31855C6589EE5EA8D" }, { "b" : "7F74EF7FD000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "BECFB85A8BC084042D5BF2BA9E66325CE798B659" }, { "b" : "7F74EF418000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "CBDA444A7109874C5350AE9CEEF3F82F749B347F" }, { "b" : "7F74EF210000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "552CEC3216281CCFD7FA6432C723D50163255823" }, { "b" : "7F74EF00C000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : 
"2AF795BFFD122309BA3359FEBABB5D0967403D17" }, { "b" : "7F74EED88000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "4AAEE970B045D8BF946578B9C7F3AB5CDE9AB44A" }, { "b" : "7F74EEB72000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "EDC925E58FE28DCA536993EB13179C739F1E6566" }, { "b" : "7F74EE955000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "4EA475CD3FD3B69B6C95D9381FA74B36DB4992EF" }, { "b" : "7F74EE5C1000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "BCA7789C2EA8E28CB7CE553E183AC7E7EE36F8A2" }, { "b" : "7F74EFA69000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "97AF4B77212F74CFF72B6C013E6AA2D74A97EF60" }, { "b" : "7F74EE37D000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "9A737F8BF10FC99C37CC404D3FC188F6E11FEDD9" }, { "b" : "7F74EE096000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "8D3D6E28DF6EB3752642A7031AAC17D39EA4265D" }, { "b" : "7F74EDE92000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "7EC54D6E88BB7D2C1284117C2A483496A01EAAF4" }, { "b" : "7F74EDC66000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "CC89B4C8CDCCD32BA610BC72784DC3B7E9BD9E19" }, { "b" : "7F74EDA50000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7F74ED845000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "E0C522C589F775C324330BE09CE67DC83950A213" }, { "b" : "7F74ED642000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "AF374BAFB7F5B139A0B431D3F06D82014AFF3251" }, { "b" : "7F74ED428000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "4786A2A5D30B121601958E84D643C70C13C4FBA5" }, { "b" : "7F74ED209000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "B4576BE308DDCF7BC31F7304E4734C3D846D0236" } ] }}
 mongod-3.4(_ZN5mongo15printStackTraceERSo+0x41) [0x562636fa4a51]
 mongod-3.4(+0x1578C69) [0x562636fa3c69]
 mongod-3.4(+0x15792D6) [0x562636fa42d6]
 libpthread.so.0(+0xF7E0) [0x7f74ee9647e0]
 mongod-3.4(_ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEmi+0xE3) [0x56263716bca3]
 mongod-3.4(_ZN8tcmalloc11ThreadCache11ListTooLongEPNS0_8FreeListEm+0x1C) [0x56263716bd7c]
 mongod-3.4(_ZdlPvRKSt9nothrow_t+0x26A) [0x562637aa1d7a]
 mongod-3.4(__wt_split_stash_discard+0xC6) [0x5626378e0466]
 mongod-3.4(+0x1EB55EA) [0x5626378e05ea]
 mongod-3.4(+0x1EB5C08) [0x5626378e0c08]
 mongod-3.4(__wt_split_reverse+0x83) [0x5626378e3ac3]
 mongod-3.4(+0x1F09294) [0x562637934294]
 mongod-3.4(__wt_evict+0xAC7) [0x562637934e47]
 mongod-3.4(+0x1F04813) [0x56263792f813]
 mongod-3.4(+0x1F04BA7) [0x56263792fba7]
 mongod-3.4(__wt_evict_thread_run+0xD3) [0x5626379316a3]
 mongod-3.4(__wt_thread_run+0x16) [0x56263799b906]
 libpthread.so.0(+0x7AA1) [0x7f74ee95caa1]
 libc.so.6(clone+0x6D) [0x7f74ee6a9c4d]
-----  END BACKTRACE  -----
 

Comment by ADAM Martin (Inactive) [ 17/Aug/18 ]

There is insufficient information to diagnose this problem at this time. Weaver, if you can get us more diagnostic information for this problem (a core dump, log files, or similar), please re-open this ticket with that supplementary information.

Comment by Andrew Morrow (Inactive) [ 19/Jul/18 ]

adam.martin - Please provide an update.

Comment by Kelsey Schubert [ 17/Apr/18 ]

Hi stutiredboy@gmail.com,

Thanks for your report. Please note that the segmentation fault you report is unrelated to the original issue this ticket describes. The issue you're encountering is being tracked in WT-4037, and I recommend that you watch WT-4037 for updates about its fix.

Kind regards,
Kelsey

Comment by Adun [ 17/Apr/18 ]

I ran into this exception when running a YCSB test.

YCSB:
1. Inserts only: no read, update, scan, or read-modify-write operations
2. threadcount is 50
3. MongoDB Java driver is 3.6.3
4. The crash does not happen on every run.

OS:
1. Debian 9 amd64 with XFS;
2. disk is SSD
3. limits 3 cpu + 25G memory, wiredTigerCacheSizeGB = 16
4. MongoD 3.6.4 binary from download.mongod.com
5. MongoD run in standalone mode
6. ulimit as below:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 386157
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65535
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Network:
1. 10gb Ethernet
2. server and client are in the same subnet
3. ping latency below 0.04 ms

crash logs below:

2018-04-17T12:42:06.261155+08:00 [conn37] Invalid access at address: 0
2018-04-17T12:42:06.471613+08:00 [conn37] Got signal: 11 (Segmentation fault).
 0x56314c704b71 0x56314c703d89 0x56314c7043f6 0x7f840df050c0 0x56314b003ad4 0x56314b026ef6 0x56314b097007 0x56314b03fe56 0x56314af55a56 0x56314af55ce4 0x56314b217e34 0x56314b218444 0x56314b1fccda 0x56314b2027b5 0x56314b1e8e9e 0x56314b1e2b08 0x56314c078a4f 0x56314b12a0b1 0x56314b12ba24 0x56314b12c777 0x56314b138faa 0x56314b134957 0x56314b137d91 0x56314c039092 0x56314b1337bf 0x56314b135d05 0x56314b1365fb 0x56314b1349dd 0x56314b137d91 0x56314c0395f5 0x56314c5ce194 0x7f840defb494 0x7f840dc3dacf
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"56314A526000","o":"21DEB71","s":"_ZN5mongo15printStackTraceERSo"},{"b":"56314A526000","o":"21DDD89"},{"b":"56314A526000","o":"21DE3F6"},{"b":"7F840DEF4000","o":"110C0"},{"b":"56314A526000","o":"ADDAD4","s":"__wt_page_in_func"},{"b":"56314A526000","o":"B00EF6","s":"__wt_row_search"},{"b":"56314A526000","o":"B71007","s":"__wt_btcur_insert"},{"b":"56314A526000","o":"B19E56"},{"b":"56314A526000","o":"A2FA56","s":"_ZN5mongo21WiredTigerRecordStore14_insertRecordsEPNS_16OperationContextEPNS_6RecordEPKNS_9TimestampEm"},{"b":"56314A526000","o":"A2FCE4","s":"_ZN5mongo21WiredTigerRecordStore13insertRecordsEPNS_16OperationContextEPSt6vectorINS_6RecordESaIS4_EEPS3_INS_9TimestampESaIS8_EEb"},{"b":"56314A526000","o":"CF1E34","s":"_ZN5mongo14CollectionImpl16_insertDocumentsEPNS_16OperationContextEN9__gnu_cxx17__normal_iteratorIPKNS_15InsertStatementESt6vectorIS5_SaIS5_EEEESB_bPNS_7OpDebugE"},{"b":"56314A526000","o":"CF2444","s":"_ZN5mongo14CollectionImpl15insertDocumentsEPNS_16OperationContextEN9__gnu_cxx17__normal_iteratorIPKNS_15InsertStatementESt6vectorIS5_SaIS5_EEEESB_PNS_7OpDebugEbb"},{"b":"56314A526000","o":"CD6CDA"},{"b":"56314A526000","o":"CDC7B5","s":"_ZN5mongo14performInsertsEPNS_16OperationContextERKNS_9write_ops6InsertE"},{"b":"56314A526000","o":"CC2E9E"},{"b":"56314A526000","o":"CBCB08"},{"b":"56314A526000","o":"1B52A4F","s":"_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE"},{"b":"56314A526000","o":"C040B1"},{"b":"56314A526000","o":"C05A24"},{"b":"56314A526000","o":"C06777","s":"_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE"},{"b":"56314A526000","o":"C12FAA","s":"_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE"},{"b":"56314A526000","o":"C0E957","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"56314A526000","o":"C11D91"},{"b":"56314A526000","o":"1B13092","s":"_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleE
St8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE"},{"b":"56314A526000","o":"C0D7BF","s":"_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE"},{"b":"56314A526000","o":"C0FD05","s":"_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE"},{"b":"56314A526000","o":"C105FB","s":"_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE"},{"b":"56314A526000","o":"C0E9DD","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE"},{"b":"56314A526000","o":"C11D91"},{"b":"56314A526000","o":"1B135F5"},{"b":"56314A526000","o":"20A8194"},{"b":"7F840DEF4000","o":"7494"},{"b":"7F840DB55000","o":"E8ACF","s":"clone"}], "processInfo":{ "mongodbVersion" : "3.6.4", "gitVersion" : "d0181a711f7e7f39e60b5aeb1dc7097bf6ae5856", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.9.0-6-amd64", "version" : "#1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02)", "machine" : "x86_64" }, "somap" : [ { "b" : "56314A526000", "elfType" : 3, "buildId" : "9E8992AF64DDDA5CD452F1A1FFBB558210B8AD34" }, { "b" : "7FFD401B1000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "5B59EED0BE6765CA5F6BB78CBB0875891E340002" }, { "b" : "7F840EA38000", "path" : "/lib/x86_64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "713D47D5F599289C0A91ADE8F0122B2B4AA78B2E" }, { "b" : "7F840E830000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "5D83E0642E645026DBB11F89F7DF7106BD821495" }, { "b" : "7F840E62C000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "B895F0831F623C5F23603401D4069F9F94C24761" }, { "b" : "7F840E328000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1B95E3A8B8788B07E4F59EE69B1877F9DEB42033" }, { "b" : "7F840E111000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : 
"51AD5FD294CD6C813BED40717347A53434B80B7A" }, { "b" : "7F840DEF4000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "4285CD3158DDE596765C747AE210AB6CBD258B22" }, { "b" : "7F840DB55000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" :"AA889E26A70F98FA8D230D088F7CC5BF43573163" }, { "b" : "7F840EC4F000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "263F909DBE11A66F7C6233E3FF0521148D9F8370" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x56314c704b71]
 mongod(+0x21DDD89) [0x56314c703d89]
 mongod(+0x21DE3F6) [0x56314c7043f6]
 libpthread.so.0(+0x110C0) [0x7f840df050c0]
 mongod(__wt_page_in_func+0x1804) [0x56314b003ad4]
 mongod(__wt_row_search+0x856) [0x56314b026ef6]
 mongod(__wt_btcur_insert+0x1027) [0x56314b097007]
 mongod(+0xB19E56) [0x56314b03fe56]
 mongod(_ZN5mongo21WiredTigerRecordStore14_insertRecordsEPNS_16OperationContextEPNS_6RecordEPKNS_9TimestampEm+0x346) [0x56314af55a56]
 mongod(_ZN5mongo21WiredTigerRecordStore13insertRecordsEPNS_16OperationContextEPSt6vectorINS_6RecordESaIS4_EEPS3_INS_9TimestampESaIS8_EEb+0x34) [0x56314af55ce4]
 mongod(_ZN5mongo14CollectionImpl16_insertDocumentsEPNS_16OperationContextEN9__gnu_cxx17__normal_iteratorIPKNS_15InsertStatementESt6vectorIS5_SaIS5_EEEESB_bPNS_7OpDebugE+0x374) [0x56314b217e34]
 mongod(_ZN5mongo14CollectionImpl15insertDocumentsEPNS_16OperationContextEN9__gnu_cxx17__normal_iteratorIPKNS_15InsertStatementESt6vectorIS5_SaIS5_EEEESB_PNS_7OpDebugEbb+0x164) [0x56314b218444]
 mongod(+0xCD6CDA) [0x56314b1fccda]
 mongod(_ZN5mongo14performInsertsEPNS_16OperationContextERKNS_9write_ops6InsertE+0xF75) [0x56314b2027b5]
 mongod(+0xCC2E9E) [0x56314b1e8e9e]
 mongod(+0xCBCB08) [0x56314b1e2b08]
 mongod(_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x1F) [0x56314c078a4f]
 mongod(+0xC040B1) [0x56314b12a0b1]
 mongod(+0xC05A24) [0x56314b12ba24]
 mongod(_ZN5mongo23ServiceEntryPointMongod13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x2B7) [0x56314b12c777]
 mongod(_ZN5mongo19ServiceStateMachine15_processMessageENS0_11ThreadGuardE+0xBA) [0x56314b138faa]
 mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x97) [0x56314b134957]
 mongod(+0xC11D91) [0x56314b137d91]
 mongod(_ZN5mongo9transport26ServiceExecutorSynchronous8scheduleESt8functionIFvvEENS0_15ServiceExecutor13ScheduleFlagsENS0_23ServiceExecutorTaskNameE+0x1A2) [0x56314c039092]
 mongod(_ZN5mongo19ServiceStateMachine22_scheduleNextWithGuardENS0_11ThreadGuardENS_9transport15ServiceExecutor13ScheduleFlagsENS2_23ServiceExecutorTaskNameENS0_9OwnershipE+0x15F) [0x56314b1337bf]
 mongod(_ZN5mongo19ServiceStateMachine15_sourceCallbackENS_6StatusE+0xAF5) [0x56314b135d05]
 mongod(_ZN5mongo19ServiceStateMachine14_sourceMessageENS0_11ThreadGuardE+0x23B) [0x56314b1365fb]
 mongod(_ZN5mongo19ServiceStateMachine15_runNextInGuardENS0_11ThreadGuardE+0x11D) [0x56314b1349dd]
 mongod(+0xC11D91) [0x56314b137d91]
 mongod(+0x1B135F5) [0x56314c0395f5]
 mongod(+0x20A8194) [0x56314c5ce194]
 libpthread.so.0(+0x7494) [0x7f840defb494]
 libc.so.6(clone+0x3F) [0x7f840dc3dacf]
-----  END BACKTRACE  -----

Comment by Joe [X] [ 08/Feb/18 ]

Another one.

I think my ulimit -n for that shard may be too low. Could that cause something like this?
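
For reference, the open-files limit can be read programmatically as well as with `ulimit -n`. The sketch below uses Python's standard `resource` module and reports the limits of the process it runs in, so it would need to be run under the same user and shell setup as mongod to be meaningful (on Linux, a running process's limits can also be read from `/proc/<pid>/limits`). The 64000 threshold is an assumption taken from commonly recommended MongoDB settings, and the helper names are hypothetical. It does not answer whether a low limit could cause this crash.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-files limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value the way `ulimit` does."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def below_threshold(soft, threshold=64000):
    """True if the soft limit is finite and lower than the threshold.

    The default threshold is an assumed recommendation, not a hard rule.
    """
    return soft != resource.RLIM_INFINITY and soft < threshold


soft, hard = nofile_limits()
print(f"open files: soft={describe(soft)} hard={describe(hard)}")
if below_threshold(soft):
    print("soft limit is below the assumed 64000 recommendation")
```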

2018-02-08T18:44:52.607+0000 I COMMAND  [conn170] command national.unifiedtwo command: insert { insert: "unifiedtwo", bypassDocumentValidation: false, ordered: false, documents: 378, shardVersion: [ Timestamp(132, 5), ObjectId('5a77795701d95b39c6f6109d') ], writeConcern: { getLastError: 1, w: "majority" }, $clusterTime: { clusterTime: Timestamp(1518115492, 5), signature: { hash: BinData(0, 0E1BE3F2FC862B82C7067963C8080CAAFC8E53DD), keyId: 6514038582716399617 } }, $configServerState: { opTime: { ts: Timestamp(1518115488, 1), t: 3 } }, $db: "national" } ninserted:378 keysInserted:756 numYields:0 reslen:355 locks:{ Global: { acquireCount: { r: 12, w: 12 } }, Database: { acquireCount: { w: 12 } }, Collection: { acquireCount: { w: 6 } }, oplog: { acquireCount: { w: 6 } } } protocol:op_msg 198ms
2018-02-08T18:44:52.607+0000 I COMMAND  [conn80] command national.unifiedtwo command: insert { insert: "unifiedtwo", bypassDocumentValidation: false, ordered: false, documents: 416, shardVersion: [ Timestamp(132, 5), ObjectId('5a77795701d95b39c6f6109d') ], writeConcern: { getLastError: 1, w: "majority" }, $clusterTime: { clusterTime: Timestamp(1518115492, 5), signature: { hash: BinData(0, 0E1BE3F2FC862B82C7067963C8080CAAFC8E53DD), keyId: 6514038582716399617 } }, $configServerState: { opTime: { ts: Timestamp(1518115488, 1), t: 3 } }, $db: "national" } ninserted:416 keysInserted:832 numYields:0 reslen:355 locks:{ Global: { acquireCount: { r: 14, w: 14 } }, Database: { acquireCount: { w: 14 } }, Collection: { acquireCount: { w: 7 } }, oplog: { acquireCount: { w: 7 } } } protocol:op_msg 197ms
2018-02-08T18:44:52.677+0000 F -        [conn107] Got signal: 6 (Aborted).
 
 0x556079ffd0 0x556079f280 0x556079f758 0x7faebf16c0 0x7fae759528
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"555E90A000","o":"1E95FD0","s":"_ZN5mongo15printStackTraceERSo"},{"b":"555E90A000","o":"1E95280"},{"b":"555E90A000","o":"1E95758"},{"b":"7FAEBF1000","o":"6C0","s":"__kernel_rt_sigreturn"},{"b":"7FAE728000","o":"31528","s":"gsignal"}],"processInfo":{ "mongodbVersion" : "3.6.2", "gitVersion" : "489d177dbd0f0420a8ca04d39fd78d0a2c539420", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.103-rockchip-ayufan-175", "version" : "#1 SMP Thu Jan 11 16:10:41 UTC 2018", "machine" : "aarch64" }, "somap" : [ { "b" : "555E90A000", "elfType" : 3, "buildId" : "2216AD75BB3D68D98E9E344DE66B8DCA80788007" }, { "b" : "7FAEBF1000", "elfType" : 3, "buildId" : "46672A308C2699051AF475666C327086C6412CC0" }, { "b" : "7FAEBA1000", "path" : "/lib/aarch64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "ED7A2662858A4F87391BAE87D2D8BBBB7C406D54" }, { "b" : "7FAEB37000", "path" : "/lib/aarch64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "FE1F68955FAF388EC7A72F8B9EAA75CCC0603B94" }, { "b" : "7FAE993000", "path" : "/lib/aarch64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "2AA55047FC1C4C87C0F3688195009316AE78E0AE" }, { "b" : "7FAE97C000", "path" : "/lib/aarch64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "FFAE838399016D9BEDEF5C85D772A7B1EF228127" }, { "b" : "7FAE969000", "path" : "/lib/aarch64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "A14FF51344A5D7E13CAC658D3D7863C7E5B6DDBD" }, { "b" : "7FAE8BC000", "path" : "/lib/aarch64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "78D5BDD3E860836F91FEAD99000A586662CA92D8" }, { "b" : "7FAE89B000", "path" : "/lib/aarch64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "A05222936FB1A047AC0567B0861730C781C69A23" }, { "b" : "7FAE86F000", "path" : "/lib/aarch64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "629C063061FD5A4EAA215BFDE0EC8C02F40893F0" }, { "b" : "7FAE728000", "path" : "/lib/aarch64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : 
"5A4C72FADB14FE1891F8A6EA77275E1B7F52D820" }, { "b" : "7FAEBC6000", "path" : "/lib/ld-linux-aarch64.so.1", "elfType" : 3, "buildId" : "42369A5C3B1A57397BDA983454CDE519F7F8DEC5" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x48) [0x556079ffd0]
 mongod(+0x1E95280) [0x556079f280]
 mongod(+0x1E95758) [0x556079f758]
 (__kernel_rt_sigreturn+0x0) [0x7faebf16c0]
 libc.so.6(gsignal+0x38) [0x7fae759528]
-----  END BACKTRACE  -----

Comment by Joe [X] [ 07/Feb/18 ]

I think I ran something like this on each of the nodes:

echo "deb http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.6 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.6.list
sudo apt-get install mongodb-org=3.6.2 mongodb-org-server=3.6.2 mongodb-org-shell=3.6.2 mongodb-org-mongos=3.6.2 mongodb-org-tools=3.6.2

Although, I copied this out of the notes, so it might have been a bit different.

Comment by Andrew Morrow (Inactive) [ 07/Feb/18 ]

Weaver - Never mind about the aarch64 clarification, I see /lib/aarch64 in the stack trace.

Comment by Andrew Morrow (Inactive) [ 07/Feb/18 ]

Hi Weaver - Where did you obtain the binaries that you are running? From the MongoDB download site? Built from source? Something else? We have tried symbolizing the stack traces here, and they don't line up with what we would expect. Can you also confirm that you are running ARM64 (aka aarch64) binaries?
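
As background on the trace format: each frame in the JSON backtrace carries a load base (`b`) and an offset (`o`), both in hexadecimal, and their sum is the absolute address printed on the signal line. The sketch below shows that arithmetic on the first two frames of the original report. The function name `frame_addresses` is hypothetical; the offsets it returns are what one would typically hand to a symbolizer such as `addr2line` together with a binary whose build ID matches the `somap` entry.

```python
import json


def frame_addresses(backtrace_json):
    """Return (absolute_address, offset, symbol_or_None) for each frame."""
    doc = json.loads(backtrace_json)
    frames = []
    for frame in doc["backtrace"]:
        base = int(frame["b"], 16)    # load address of the module
        offset = int(frame["o"], 16)  # offset of the frame inside the module
        frames.append((base + offset, offset, frame.get("s")))
    return frames


# First two frames of the backtrace in the original report (3.6.2, aarch64).
sample = (
    '{"backtrace":['
    '{"b":"557CF3D000","o":"1E95FD0","s":"_ZN5mongo15printStackTraceERSo"},'
    '{"b":"557CF3D000","o":"1E95280"}'
    "]}"
)

for address, offset, symbol in frame_addresses(sample):
    print(f"0x{address:x}  +0x{offset:X}  {symbol or '(no symbol)'}")
```

The computed addresses can be checked against the list printed right after "Got signal" in the log; if the offsets resolve to unrelated functions in the official binary, the running binary is probably not the one being symbolized against.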

Comment by Joe [X] [ 07/Feb/18 ]

I restarted the import process, which led to another crash.

Lots of moveChunk messages:

2018-02-07T22:10:58.112+0000 I SHARDING [conn2] moveChunk data transfer progress: { active: true, sessionId: "newport_denver_5a7b7627736306be512afc1b", ns: "national.unifiedtwo", from: "newport/192.168.1.183:27017", min: { LastName: "ANDERSON" }, max: { LastName: "ANGARANO" }, shardKeyPattern: { LastName: 1.0 }, state: "ready", counts: { cloned: 0, clonedBytes: 0, catchup: 0, steady: 0 }, ok: 1.0, operationTime: Timestamp(1518041448, 25), $gleStats: { lastOpTime: { ts: Timestamp(1518040621, 24), t: 2 }, electionId: ObjectId('7fffffff0000000000000002') }, $clusterTime: { clusterTime: Timestamp(1518041458, 90), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $configServerState: { opTime: { ts: Timestamp(1518041456, 1), t: 3 } } } mem used: 19140 documents remaining to clone: 228903
2018-02-07T22:10:58.430+0000 F - [Collection Range Deleter] Got signal: 6 (Aborted).

0x557edaefd0 0x557edae280 0x557edae758 0x7fa17096c0 0x7fa1271528
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"557CF19000","o":"1E95FD0","s":"_ZN5mongo15printStackTraceERSo"},{"b":"557CF19000","o":"1E95280"},{"b":"557CF19000","o":"1E95758"},{"b":"7FA1709000","o":"6C0","s":"__kernel_rt_sigreturn"},{"b":"7FA1240000","o":"31528","s":"gsignal"}],"processInfo":{ "mongodbVersion" : "3.6.2", "gitVersion" : "489d177dbd0f0420a8ca04d39fd78d0a2c539420", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.103-rockchip-ayufan-175", "version" : "#1 SMP Thu Jan 11 16:10:41 UTC 2018", "machine" : "aarch64" }, "somap" : [ { "b" : "557CF19000", "elfType" : 3, "buildId" : "2216AD75BB3D68D98E9E344DE66B8DCA80788007" }, { "b" : "7FA1709000", "elfType" : 3, "buildId" : "46672A308C2699051AF475666C327086C6412CC0" }, { "b" : "7FA16B9000", "path" : "/lib/aarch64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "ED7A2662858A4F87391BAE87D2D8BBBB7C406D54" }, { "b" : "7FA164F000", "path" : "/lib/aarch64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "FE1F68955FAF388EC7A72F8B9EAA75CCC0603B94" }, { "b" : "7FA14AB000", "path" : "/lib/aarch64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "2AA55047FC1C4C87C0F3688195009316AE78E0AE" }, { "b" : "7FA1494000", "path" : "/lib/aarch64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "FFAE838399016D9BEDEF5C85D772A7B1EF228127" }, { "b" : "7FA1481000", "path" : "/lib/aarch64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "A14FF51344A5D7E13CAC658D3D7863C7E5B6DDBD" }, { "b" : "7FA13D4000", "path" : "/lib/aarch64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "78D5BDD3E860836F91FEAD99000A586662CA92D8" }, { "b" : "7FA13B3000", "path" : "/lib/aarch64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "A05222936FB1A047AC0567B0861730C781C69A23" }, { "b" : "7FA1387000", "path" : "/lib/aarch64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "629C063061FD5A4EAA215BFDE0EC8C02F40893F0" }, { "b" : "7FA1240000", "path" : "/lib/aarch64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "5A4C72FADB14FE1891F8A6EA77275E1B7F52D820" }, { "b" : "7FA16DE000", "path" : "/lib/ld-linux-aarch64.so.1", "elfType" : 3, "buildId" : "42369A5C3B1A57397BDA983454CDE519F7F8DEC5" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x48) [0x557edaefd0]
mongod(+0x1E95280) [0x557edae280]
mongod(+0x1E95758) [0x557edae758]
(__kernel_rt_sigreturn+0x0) [0x7fa17096c0]
libc.so.6(gsignal+0x38) [0x7fa1271528]
----- END BACKTRACE -----

Comment by Joe [X] [ 07/Feb/18 ]

The log file was too big to upload, so I deleted some of its initial portion. Look at line 102,464.

Comment by Mark Agarunov [ 07/Feb/18 ]

Hello Weaver,

Thank you for the report. To get a better idea of what the cause of this behavior may be, could you please provide the complete log files from the affected mongod node(s)?

Thanks,
Mark

Generated at Thu Feb 08 04:32:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.