[SERVER-32378] Mongodb failed to start after a disk 100% full problem Created: 17/Dec/17  Updated: 27/Jul/18  Resolved: 19/Dec/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: wang yongchao Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: envns, rdi, rpu, trcf, wtc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File WiredTiger.turtle, File WiredTiger.wt, Text File mongodb-2017-12-19.log, Text File mongodb.log, File repair-SERVER-32378.tar.gz
Operating System: Linux
Steps To Reproduce:

2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] MongoDB starting : pid=548 port=27017 dbpath=/data/mongodb 64-bit host=localhost.localdomain
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] db version v3.2.7
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] git version: 4249c1d2b5999ebbf1fdf3bc0e0e3b3ff5c0aaf2
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] allocator: tcmalloc
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] modules: none
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] build environment:
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten]     distmod: rhel62
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten]     distarch: x86_64
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten]     target_arch: x86_64
2017-12-17T08:24:15.390-0500 I CONTROL  [initandlisten] options: { config: "/etc/mongodb.conf", net: { port: 27017 }, processManagement: { fork: true }, security: { authorization: "enabled" }, storage: { dbPath: "/data/mongodb" }, systemLog: { destination: "file", logAppend: true, path: "/data/mongodb/log/mongodb.log" } }
2017-12-17T08:24:15.422-0500 I -        [initandlisten] Detected data files in /data/mongodb created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-12-17T08:24:15.422-0500 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-12-17T08:24:15.436-0500 E STORAGE  [initandlisten] WiredTiger (0) [1513517055:436011][548:0x7f9ba83f3d40], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 24576: block header checksum of 1634628197 doesn't match expected checksum of 2995010700
2017-12-17T08:24:15.436-0500 E STORAGE  [initandlisten] WiredTiger (0) [1513517055:436063][548:0x7f9ba83f3d40], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2017-12-17T08:24:15.436-0500 E STORAGE  [initandlisten] WiredTiger (-31804) [1513517055:436079][548:0x7f9ba83f3d40], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2017-12-17T08:24:15.436-0500 I -        [initandlisten] Fatal Assertion 28558
2017-12-17T08:24:15.436-0500 I -        [initandlisten]
 
***aborting after fassert() failure
 
 
2017-12-17T08:24:15.460-0500 F -        [initandlisten] Got signal: 6 (Aborted).
 
 0x1337cb2 0x1336be9 0x13373f2 0x7f9ba70e3710 0x7f9ba6d72625 0x7f9ba6d73e05 0x12bea52 0x10af8d3 0x1a9f62c 0x1a9faed 0x1a9fed4 0x19d18fc 0x19d1e73 0x19cee15 0x19d2eb9 0x19ef92c 0x1a257c0 0x1a9e357 0x1a9e899 0x1a9e9bb 0x1a33db8 0x1a9b515 0x1a660ef 0x1a661ee 0x1a223a1 0x1097517 0x1093893 0xfbad38 0x9970dd 0x99acbd 0x7f9ba6d5ed5d 0x9937e9
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"F37CB2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F36BE9"},{"b":"400000","o":"F373F2"},{"b":"7F9BA70D4000","o":"F710"},{"b":"7F9BA6D40000","o":"32625","s":"gsignal"},{"b":"7F9BA6D40000","o":"33E05","s":"abort"},{"b":"400000","o":"EBEA52","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"CAF8D3"},{"b":"400000","o":"169F62C","s":"__wt_eventv"},{"b":"400000","o":"169FAED","s":"__wt_err"},{"b":"400000","o":"169FED4","s":"__wt_panic"},{"b":"400000","o":"15D18FC","s":"__wt_block_extlist_read"},{"b":"400000","o":"15D1E73","s":"__wt_block_extlist_read_avail"},{"b":"400000","o":"15CEE15","s":"__wt_block_checkpoint_load"},{"b":"400000","o":"15D2EB9"},{"b":"400000","o":"15EF92C","s":"__wt_btree_open"},{"b":"400000","o":"16257C0","s":"__wt_conn_btree_open"},{"b":"400000","o":"169E357","s":"__wt_session_get_btree"},{"b":"400000","o":"169E899","s":"__wt_session_get_btree"},{"b":"400000","o":"169E9BB","s":"__wt_session_get_btree_ckpt"},{"b":"400000","o":"1633DB8","s":"__wt_curfile_open"},{"b":"400000","o":"169B515"},{"b":"400000","o":"16660EF","s":"__wt_metadata_cursor_open"},{"b":"400000","o":"16661EE","s":"__wt_metadata_cursor"},{"b":"400000","o":"16223A1","s":"wiredtiger_open"},{"b":"400000","o":"C97517","s":"_ZN5mongo18WiredTigerKVEngineC2ERKSsS2_S2_mbbb"},{"b":"400000","o":"C93893"},{"b":"400000","o":"BBAD38","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"},{"b":"400000","o":"5970DD","s":"_ZN5mongo13initAndListenEi"},{"b":"400000","o":"59ACBD","s":"main"},{"b":"7F9BA6D40000","o":"1ED5D","s":"__libc_start_main"},{"b":"400000","o":"5937E9"}],"processInfo":{ "mongodbVersion" : "3.2.7", "gitVersion" : "4249c1d2b5999ebbf1fdf3bc0e0e3b3ff5c0aaf2", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "2.6.32-042stab123.2", "version" : "#1 SMP Mon Apr 17 17:27:00 MSK 2017", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "D6BB40ACB6CFEAA43C3A651DDD8E547FCFE19166" 
}, { "b" : "7FFC9A223000", "elfType" : 3, "buildId" : "533CF5AD95A05EFC8DF9CE63034292313B26B1DD" }, { "b" : "7F9BA7F7C000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "BECFB85A8BC084042D5BF2BA9E66325CE798B659" }, { "b" : "7F9BA7B97000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "CBDA444A7109874C5350AE9CEEF3F82F749B347F" }, { "b" : "7F9BA798F000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "583411D8786F86A1D6B8741C502831E6122445A7" }, { "b" : "7F9BA778B000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "454F8FC6CC6502C6401E5F9E221564D80665D277" }, { "b" : "7F9BA7507000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "7D8E9374F4A4EA38A7C1E763F32240EA113E4208" }, { "b" : "7F9BA72F1000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "BC7550A8A7C2D706FE4E489058BADC963465DBB7" }, { "b" : "7F9BA70D4000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "B8DFF8E53D9F2B80C3C382E83EC17C828B536A39" }, { "b" : "7F9BA6D40000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "C78F2F16F8C2F9DB39CFCE348794BD92CA56499D" }, { "b" : "7F9BA81E8000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "6F8E59B70E469F3A924A268911FF8FD0C37E7460" }, { "b" : "7F9BA6AFC000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "9B852585C66329AA02EFB28497E652A40F538E78" }, { "b" : "7F9BA6815000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "EF3AACAFD6BF71BB861F194C1559153FB0B020E2" }, { "b" : "7F9BA6611000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "152E2C18A7A2145021A8A879A01A82EE134E3946" }, { "b" : "7F9BA63E5000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "DDE6774979156442185836150FC0785170F8001F" }, { "b" : "7F9BA61CF000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7F9BA5FC4000", "path" : "/lib64/libkrb5support.so.0", 
"elfType" : 3, "buildId" : "A23DAFBCE170763BF1E836A8B26113F9CD20E0DA" }, { "b" : "7F9BA5DC1000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "AF374BAFB7F5B139A0B431D3F06D82014AFF3251" }, { "b" : "7F9BA5BA7000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "F8B68F301C19BF06AF56B4B06E0A69F89D2C1F8D" }, { "b" : "7F9BA5988000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "B4576BE308DDCF7BC31F7304E4734C3D846D0236" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1337cb2]
 mongod(+0xF36BE9) [0x1336be9]
 mongod(+0xF373F2) [0x13373f2]
 libpthread.so.0(+0xF710) [0x7f9ba70e3710]
 libc.so.6(gsignal+0x35) [0x7f9ba6d72625]
 libc.so.6(abort+0x175) [0x7f9ba6d73e05]
 mongod(_ZN5mongo13fassertFailedEi+0x82) [0x12bea52]
 mongod(+0xCAF8D3) [0x10af8d3]
 mongod(__wt_eventv+0x42C) [0x1a9f62c]
 mongod(__wt_err+0x8D) [0x1a9faed]
 mongod(__wt_panic+0x24) [0x1a9fed4]
 mongod(__wt_block_extlist_read+0x6C) [0x19d18fc]
 mongod(__wt_block_extlist_read_avail+0x33) [0x19d1e73]
 mongod(__wt_block_checkpoint_load+0x3C5) [0x19cee15]
 mongod(+0x15D2EB9) [0x19d2eb9]
 mongod(__wt_btree_open+0xC7C) [0x19ef92c]
 mongod(__wt_conn_btree_open+0x140) [0x1a257c0]
 mongod(__wt_session_get_btree+0xE7) [0x1a9e357]
 mongod(__wt_session_get_btree+0x629) [0x1a9e899]
 mongod(__wt_session_get_btree_ckpt+0xAB) [0x1a9e9bb]
 mongod(__wt_curfile_open+0x218) [0x1a33db8]
 mongod(+0x169B515) [0x1a9b515]
 mongod(__wt_metadata_cursor_open+0x5F) [0x1a660ef]
 mongod(__wt_metadata_cursor+0x7E) [0x1a661ee]
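
The decisive line in the log above is the WiredTiger checksum error on WiredTiger.wt. As a triage aid, here is a minimal Python sketch (not part of the original report) that pulls the file name, block size, offset, and the two checksums out of such a line. The regular expression is an assumption modeled only on the single message shown in this ticket, so other WiredTiger versions may word it differently. The function name `parse_checksum_error` is hypothetical.

```python
import re

# Assumption: pattern derived from the one error line in this ticket only.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>\S+?), .*?read checksum error for (?P<size>\d+)B block "
    r"at offset (?P<offset>\d+): block header checksum of (?P<header>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return a dict describing a WiredTiger checksum error, or None if no match."""
    match = CHECKSUM_RE.search(line)
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "block_size": int(match.group("size")),
        "offset": int(match.group("offset")),
        "header_checksum": int(match.group("header")),
        "expected_checksum": int(match.group("expected")),
    }
```

Applied to the error line in this ticket, the sketch reports a 4096-byte block at offset 24576 in WiredTiger.wt, that is, the block that starts 6 x 4096 bytes into the file.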

Participants:

 Description   

My server hit a 'No space left on device' error earlier today because the disk ran out of inodes.
After we fixed the disk problem, MongoDB was still unable to start or repair.
I desperately need your help : (
The log information is as follows:



 Comments   
Comment by Mark Agarunov [ 19/Dec/17 ]

Hello aquilx,

Unfortunately, this error indicates corruption on disk, most often caused by a faulty storage layer. In this situation, our best recommendation is to resync the affected node or restore from a backup if possible.

To prevent this type of problem in the future, please take note of the following guidelines to help mitigate any issues related to unreliable storage layers or server failures.

Thanks,
Mark
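
Since the reporter traced the original failure to inode exhaustion on the data disk, one precaution is to watch both free blocks and free inodes on the filesystem that holds the dbpath. The following is a minimal Python sketch under stated assumptions: it is not taken from MongoDB's guidelines, the function names `disk_headroom` and `is_low` and the 10% threshold are hypothetical, and it relies on `os.statvfs`, which is available on Linux and other Unix systems.

```python
import os


def disk_headroom(path):
    """Return (free block fraction, free inode fraction) for path's filesystem."""
    st = os.statvfs(path)
    # Some filesystems report 0 total inodes or blocks; treat that as unlimited.
    free_blocks = st.f_bavail / st.f_blocks if st.f_blocks else 1.0
    free_inodes = st.f_favail / st.f_files if st.f_files else 1.0
    return free_blocks, free_inodes


def is_low(path, min_fraction=0.10):
    """True if either free space or free inodes falls below min_fraction."""
    free_blocks, free_inodes = disk_headroom(path)
    return free_blocks < min_fraction or free_inodes < min_fraction
```

For example, a cron job could call `is_low("/data/mongodb")` and raise an alert before the disk fills. Checking inodes separately matters because a disk can report free space while having no inodes left.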

Comment by wang yongchao [ 19/Dec/17 ]

Hello Mark,
Thank you for your help. After replacing these files, mongod still fails with an error, but a different one. Please help me try again.
I have already uploaded the new log file "mongodb-2017-12-19.log".

Thanks,
yongchao

Comment by Mark Agarunov [ 18/Dec/17 ]

Hello aquilx,

Thank you for your report. I've attached a repair attempt of the files you provided. Please extract these files and replace them in your $dbpath and let us know if it resolves the issue. If you are still seeing errors after replacing these files, please provide the complete logs from the affected node(s) so that we can further investigate.

Thanks,
Mark
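
The replacement step described in the comment above can be sketched in Python as follows. This is a hedged illustration, not an official procedure. It assumes mongod is not running, that the archive stores its files with paths relative to the dbpath (the actual layout of repair-SERVER-32378.tar.gz is not shown in this ticket), and that a backup of the overwritten files is wanted first. The function name `replace_with_repaired` and the backup directory are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def replace_with_repaired(archive, dbpath, backup_dir):
    """Back up files the archive would overwrite, then extract it into dbpath."""
    dbpath = Path(dbpath).resolve()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    with tarfile.open(archive, "r:gz") as tar:
        members = [m for m in tar.getmembers() if m.isfile()]

        # Refuse archive entries that would land outside dbpath.
        for member in members:
            target = (dbpath / member.name).resolve()
            if dbpath not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")

        # Keep a copy of every existing file before it is replaced.
        for member in members:
            existing = dbpath / member.name
            if existing.exists():
                dest = backup_dir / member.name
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(existing, dest)

        tar.extractall(dbpath, members=members)

    return [m.name for m in members]
```

Keeping the pre-repair copies makes it possible to hand the original files back for further investigation if the repaired ones do not resolve the issue.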
