Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: 3.4.13
Component/s: None
Labels: None
Hi,
our MongoDB clusters recently crashed in two separate environments with the same error message. Details are below:
2018-07-11T04:07:43.128+0000 E STORAGE [thread2] WiredTiger error (62) [1531282063:127984][2808:0x7f66d81a2700], file:WiredTiger.wt, WT_SESSION.checkpoint: /data/WiredTiger.turtle: handle-open: open: Timer expired
2018-07-11T04:07:43.128+0000 E STORAGE [thread2] WiredTiger error (62) [1531282063:128202][2808:0x7f66d81a2700], checkpoint-server: checkpoint server error: Timer expired
2018-07-11T04:07:43.128+0000 E STORAGE [thread2] WiredTiger error (-31804) [1531282063:128228][2808:0x7f66d81a2700], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-07-11T04:07:43.128+0000 I - [thread2] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2018-07-11T04:07:43.128+0000 I - [thread2] ***aborting after fassert() failure
2018-07-11T04:07:43.128+0000 I - [conn6891995] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 64
2018-07-11T04:07:43.128+0000 I - [conn6891995] ***aborting after fassert() failure
2018-07-11T04:07:43.147+0000 I - [WTJournalFlusher] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 64
2018-07-11T04:07:43.147+0000 I - [WTJournalFlusher] ***aborting after fassert() failure
2018-07-11T04:07:43.199+0000 F - [thread2] Got signal: 6 (Aborted).
0x5606d175b5b1 0x5606d175a7c9 0x5606d175acad 0x7f66de955690 0x7f66de5af277 0x7f66de5b0968 0x5606d09ff597 0x5606d146a316 0x5606d0a09c3e 0x5606d0a09e5a 0x5606d0a0a0bc 0x5606d20bc3b3 0x7f66de94dde5 0x7f66de677bad
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"5606D01E2000","o":"15795B1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"5606D01E2000","o":"15787C9"},{"b":"5606D01E2000","o":"1578CAD"},{"b":"7F66DE946000","o":"F690"},{"b":"7F66DE579000","o":"36277","s":"gsignal"},{"b":"7F66DE579000","o":"37968","s":"abort"},{"b":"5606D01E2000","o":"81D597","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"5606D01E2000","o":"1288316"},{"b":"5606D01E2000","o":"827C3E","s":"__wt_eventv"},{"b":"5606D01E2000","o":"827E5A","s":"__wt_err"},{"b":"5606D01E2000","o":"8280BC","s":"__wt_panic"},{"b":"5606D01E2000","o":"1EDA3B3"},{"b":"7F66DE946000","o":"7DE5"},{"b":"7F66DE579000","o":"FEBAD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.13", "gitVersion" : "fbdef2ccc53e0fcc9afb570063633d992b2aae42", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.14.47-56.37.amzn1.x86_64", "version" : "#1 SMP Wed Jun 6 18:49:01 UTC 2018", "machine" : "x86_64" }, "somap" : [ { "b" : "5606D01E2000", "elfType" : 3, "buildId" : "0B8D59C7E131539CC482C89F0220B0866123E74F" }, { "b" : "7FFE87344000", "elfType" : 3, "buildId" : "2DF3D53B81C4CFBB2F14430578A041B78D5A1EE2" }, { "b" : "7F66DF8E4000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "9C4EB34A346260F2A77746F4E5ED837619137DB7" }, { "b" : "7F66DF486000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "EC480B38432587A9B21BFBD917EF020731EBD2CF" }, { "b" : "7F66DF27E000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "F2701E2A24459D5B55DF5549D585F091E7BCF07A" }, { "b" : "7F66DF07A000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "0E5CD5BAA5EE8BF3648A5031B088F9A78C89364F" }, { "b" : "7F66DED78000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "07FB92AFEF1756F093371CE60C3AE85DD3A06325" }, { "b" : "7F66DEB62000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A03C9A80E995ED5F43077AB754A258FA0E34C3CD" }, { "b" : "7F66DE946000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "D973C39D1900DC61D8519C653C3BC405692DE563" }, { "b" : "7F66DE579000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "AF310F56618FC1EF9158973484F60942F11CC0FB" }, { "b" : "7F66DFB55000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8402047FD4A85B3CD1142346EA06BCD6E15A82A3" }, { "b" : "7F66DE32C000", "path" : "/usr/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "6FBBD34B86296FDF883FE5122017EC5CD3F98ED7" }, { "b" : "7F66DE044000", "path" : "/usr/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "76429E6FD408BBB675798D6458F2735383710D0B" }, { "b" : "7F66DDE41000", "path" : "/usr/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "5C01209C5AE1B1714F19B07EB58F2A1274B69DC8" }, { "b" : "7F66DDC0E000", "path" : "/usr/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "5B2A76F1EF91EDAA0494BE680CADAFE6489326E1" }, { "b" : "7F66DD9F8000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "89C6AF118B6B4FB6A73AE1813E2C8BDD722956D1" }, { "b" : "7F66DD7EA000", "path" : "/usr/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "3ACB59488C6D8DE0A1F4F1B0C290A570D9E42F3D" }, { "b" : "7F66DD5E7000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "37A58210FA50C91E09387765408A92909468D25B" }, { "b" : "7F66DD3CE000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "9E5E0BF5F22DE7555BC4B9853240817147489258" }, { "b" : "7F66DD1AD000", "path" : "/usr/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "F5054DC94443326819FBF3065CFDF5E4726F57EE" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x5606d175b5b1]
 mongod(+0x15787C9) [0x5606d175a7c9]
 mongod(+0x1578CAD) [0x5606d175acad]
 libpthread.so.0(+0xF690) [0x7f66de955690]
 libc.so.6(gsignal+0x37) [0x7f66de5af277]
 libc.so.6(abort+0x148) [0x7f66de5b0968]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x5606d09ff597]
 mongod(+0x1288316) [0x5606d146a316]
 mongod(__wt_eventv+0x3D7) [0x5606d0a09c3e]
 mongod(__wt_err+0x9D) [0x5606d0a09e5a]
 mongod(__wt_panic+0x2E) [0x5606d0a0a0bc]
 mongod(+0x1EDA3B3) [0x5606d20bc3b3]
 libpthread.so.0(+0x7DE5) [0x7f66de94dde5]
 libc.so.6(clone+0x6D) [0x7f66de677bad]
----- END BACKTRACE -----
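As a cross-check when reading the backtrace, each frame's absolute address should equal the load base "b" plus the offset "o" from the JSON. A minimal Python sketch, using three frames copied from the trace above (the helper name resolve is hypothetical and not part of any MongoDB tooling):

```python
# Frames copied from the backtrace JSON: (load base "b", offset "o", symbol "s").
frames = [
    ("5606D01E2000", "15795B1", "_ZN5mongo15printStackTraceERSo"),
    ("7F66DE579000", "36277", "gsignal"),
    ("5606D01E2000", "8280BC", "__wt_panic"),
]


def resolve(base_hex: str, offset_hex: str) -> int:
    """Return a frame's absolute address: load base plus offset."""
    return int(base_hex, 16) + int(offset_hex, 16)


for base, offset, symbol in frames:
    # Print the address in the same lowercase hex form the log uses.
    print(f"{resolve(base, offset):#x} {symbol}")
```

The printed addresses can be compared against the raw address list that precedes the BEGIN BACKTRACE marker.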
We see a similar backtrace on the other nodes.
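To compare the failure across nodes, the WiredTiger error lines can be pulled out of each node's log and reduced to an error code and message. The sketch below is one possible way to do that in Python; the regular expression and the function name parse_wt_error are assumptions based only on the log lines quoted in this report, not on a documented log format.

```python
import re

# Assumed shape, taken from the lines in this report:
#   ... WiredTiger error (<code>) [<sec:usec>][<pid:tid>], <message>
WT_ERROR = re.compile(r"WiredTiger error \((-?\d+)\) \[[^\]]*\]\[[^\]]*\], (.*)$")


def parse_wt_error(line: str):
    """Return (code, message) for a WiredTiger error line, or None if no match."""
    match = WT_ERROR.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


sample = (
    "2018-07-11T04:07:43.128+0000 E STORAGE [thread2] WiredTiger error (62) "
    "[1531282063:128202][2808:0x7f66d81a2700], checkpoint-server: "
    "checkpoint server error: Timer expired"
)
print(parse_wt_error(sample))
```

In the log above, code 62 is reported with the message "Timer expired" (on Linux this number corresponds to ETIME), and -31804 is reported as WT_PANIC.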
Timeline:
ip-10-120-28-149: 02:18:18.276 [....] Got signal: 6 (Aborted).
ip-10-120-28-115: 03:16:17.437 [....] Got signal: 6 (Aborted).
ip-10-120-28-168: 04:07:43.199 [....] Got signal: 6 (Aborted).
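The spacing between the three aborts may help when correlating them with other events. A short Python sketch that computes the gaps, assuming all three timestamps fall on the same day (the report gives a date only for the 04:07:43 crash); the function name gap_seconds is hypothetical:

```python
from datetime import datetime

# Crash times copied from the timeline above, in the order they occurred.
crashes = [
    ("ip-10-120-28-149", "02:18:18.276"),
    ("ip-10-120-28-115", "03:16:17.437"),
    ("ip-10-120-28-168", "04:07:43.199"),
]


def gap_seconds(earlier: str, later: str) -> float:
    """Seconds between two HH:MM:SS.mmm timestamps on the same (assumed) day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()


# Walk consecutive pairs of crashes and report the gap in minutes.
for (prev_host, prev_time), (host, time) in zip(crashes, crashes[1:]):
    minutes = gap_seconds(prev_time, time) / 60
    print(f"{prev_host} -> {host}: {minutes:.1f} min")
```

With the times above, the nodes went down roughly 58 and 51 minutes apart.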
Environment:
The MongoDB cluster (a 3-node replica set) runs on AWS infrastructure.
[ec2-user@ip-10-120-28-149 ~]$ uname -a
Linux ip-10-120-28-149 4.14.47-56.37.amzn1.x86_64 #1 SMP Wed Jun 6 18:49:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[ec2-user@ip-10-120-28-149 ~]$ mongo --version
MongoDB shell version v3.4.13
git version: fbdef2ccc53e0fcc9afb570063633d992b2aae42
OpenSSL version: OpenSSL 1.0.0-fips 29 Mar 2010
allocator: tcmalloc
modules: none
build environment:
    distmod: amazon
    distarch: x86_64
    target_arch: x86_64