[SERVER-42245] file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic Created: 16/Jul/19 Updated: 24/Jul/19 Resolved: 22/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Chengcheng Ma | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | wtc | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Steps To Reproduce: | |
| Participants: |
| Description |
|
The problem was caused by a power failure: all 3 nodes in the replica set crashed simultaneously. We are running MongoDB 3.4.2 on CentOS release 6.5 (Final).
I have found a few earlier issues where mongod data files could not be repaired because of a corrupted WiredTiger.wt or WiredTiger.turtle. In each case the reporter uploaded the two files, and you repaired them and uploaded the repaired copies.
I am encountering exactly the same issue, so here are the original WT files and the repair output. Would you please help me fix this?
The below are the repair output: 2019-07-16T16:41:45.178+0800 I CONTROL [initandlisten] MongoDB starting : pid=2255 port=30000 dbpath=/iflytek/data/mongodb/new_data/data27017_bak/ 64-bit host=i-A3566A06 , repair: true, storage: { dbPath: "/iflytek/data/mongodb/new_data/data27017_bak/" } } ***aborting after fassert() failure 2019-07-16T16:41:45.413+0800 F - [initandlisten] Got signal: 6 (Aborted). 0x7fdc4edb8d21 0x7fdc4edb7e19 0x7fdc4edb82fd 0x7fdc4c508710 0x7fdc4c197925 0x7fdc4c199105 0x7fdc4e040c9d 0x7fdc4eac73e6 0x7fdc4e04af80 0x7fdc4e04b074 0x7fdc4e04b2cc 0x7fdc4f6bb1df 0x7fdc4f6bb72b 0x7fdc4f6b7e3d 0x7fdc4f6bc907 0x7fdc4f6da056 0x7fdc4f7106db 0x7fdc4f79d4fd 0x7fdc4f79dbf8 0x7fdc4f79e10c 0x7fdc4f7204c1 0x7fdc4f792f30 0x7fdc4f75d38e 0x7fdc4f75d47b 0x7fdc4f70cb2f 0x7fdc4eaaacec 0x7fdc4eaa3865 0x7fdc4e991fa7 0x7fdc4e02c559 0x7fdc4e04c6c4 0x7fdc4c183d1d 0x7fdc4e0aa371 ,{"b":"7FDC4D82F000","o":"1588E19"},{"b":"7FDC4D82F000","o":"15892FD"},{"b":"7FDC4C4F9000","o":"F710"},{"b":"7FDC4C165000","o":"32925","s":"gsignal"},{"b":"7FDC4C165000","o":"34105","s":"abort"},{"b":"7FDC4D82F000","o":"811C9D","s":"ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"7FDC4D82F000","o":"12983E6"},{"b":"7FDC4D82F000","o":"81BF80"},{"b":"7FDC4D82F000","o":"81C074","s":"wt_err"},{"b":"7FDC4D82F000","o":"81C2CC","s":"wt_panic"},{"b":"7FDC4D82F000","o":"1E8C1DF"},{"b":"7FDC4D82F000","o":"1E8C72B"},{"b":"7FDC4D82F000","o":"1E88E3D"},{"b":"7FDC4D82F000","o":"1E8D907"},{"b":"7FDC4D82F000","o":"1EAB056"},{"b":"7FDC4D82F000","o":"1EE16DB"},{"b":"7FDC4D82F000","o":"1F6E4FD"},{"b":"7FDC4D82F000","o":"1F6EBF8"},{"b":"7FDC4D82F000","o":"1F6F10C"},{"b":"7FDC4D82F000","o":"1EF14C1"},{"b":"7FDC4D82F000","o":"1F63F30"},{"b":"7FDC4D82F000","o":"1F2E38E"},{"b":"7FDC4D82F000","o":"1F2E47B"},{"b":"7FDC4D82F000","o":"1EDDB2F","s":"wiredtiger_open"},{"b":"7FDC4D82F000","o":"127BCEC","s":"_ZN5mongo18WiredTigerKVEngineC2ERKNSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_PNS_11ClockSourceES8_mbbbb"},{"b"
:"7FDC4D82F000","o":"1274865"},{"b":"7FDC4D82F000","o":"1162FA7","s":"_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv"},{"b":"7FDC4D82F000","o":"7FD559"},{"b":"7FDC4D82F000","o":"81D6C4","s":"main"},{"b":"7FDC4C165000","o":"1ED1D","s":"_libc_start_main"},{"b":"7FDC4D82F000","o":"87B371"}],"processInfo":{ "mongodbVersion" : "3.4.2", "gitVersion" : "3f76e40c105fc223b3e5aac3e20dcd026b83b38b", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "2.6.32-431.el6.x86_64", "version" : "#1 SMP Fri Nov 22 03:15:09 UTC 2013", "machine" : "x86_64" }, "somap" : [ { "b" : "7FDC4D82F000", "elfType" : 3, "buildId" : "0409C529A50A34D3E255B4350462A560B78F8892" }, { "b" : "7FFF2F9F0000", "elfType" : 3, "buildId" : "81A81BE2E44C93640ADEDB62ADC93A47F4A09DD1" }, { "b" : "7FDC4D3A1000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "BECFB85A8BC084042D5BF2BA9E66325CE798B659" }, { "b" : "7FDC4CFBC000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "CBDA444A7109874C5350AE9CEEF3F82F749B347F" }, { "b" : "7FAAB95B4000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "B26528BF6C0636AC1CAE5AC50BDBC07E60851DF4" }, { "b" : "7FAAB9FB0000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "AFC7448F2F2F6ED4E5BC82B1BD8A7320B84A9D48" }, { "b" : "7FAAB892C000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "98B028A725D6E93253F25DF00B794DFAA66A3145" }, { "b" : "7FA47BF16000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "EDC925E58FE28DCA536993EB13179C739F1E6566" }, { "b" : "7FAAB90F9000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "1BB4E10307D6B94223749CFDF2AD14C365972C60" }, { "b" : "7FAAB9165000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "F1A1C0575F0EC141A157E5DFA4525E70BD27B62E" }, { "b" : "7FAABAE0D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "57BF668F99B7F5917B8D55FBB645173C9A644575" }, { "b" : "7FA478B21000", "path" : 
"/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "9A737F8BF10FC99C37CC404D3FC188F6E11FEDD9" }, { "b" : "7FA47983A000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "8D3D6E28DF6EB3752642A7031AAC17D39EA4265D" }, { "b" : "7FA47A636000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "7EC54D6E88BB7D2C1284117C2A483496A01EAAF4" }, { "b" : "7FA47900A000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "CC89B4C8CDCCD32BA610BC72784DC3B7E9BD9E19" }, { "b" : "7FDC4B5F2000", "path" : "/usr/local/lib/libz.so.1", "elfType" : 3, "buildId" : "F7DFD2C44B176B74A351A07FAEA54721D114FD95" }, { "b" : "7FA479BE7000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "E0C522C589F775C324330BE09CE67DC83950A213" }, { "b" : "7FA4795E4000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "AF374BAFB7F5B139A0B431D3F06D82014AFF3251" }, { "b" : "7FAAB63CA000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "56843351EFB2CE304A7E4BD0754991613E9EC8BD" }, { "b" : "7FA47A9AB000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "B4576BE308DDCF7BC31F7304E4734C3D846D0236" } ] }}
PS: Since this kind of problem has occurred so many times, have you considered explaining how we can repair the corrupted files ourselves, rather than only repairing them for us?
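For readers triaging the repair output above, the frames that carry a symbol name (the `"s"` field) are the quickest way to see the call path into the panic. The following is a minimal sketch, not an official MongoDB tool: the function name `symbolized_frames` and the regular expression are illustrative assumptions, and the sample frames are copied from the backtrace in this ticket.

```python
import re

# Matches one frame object such as {"b":"7FDC4D82F000","o":"81C2CC","s":"wt_panic"}.
# The "s" (symbol) field is optional; frames without it are unsymbolized.
# Entries from the "somap" list have no "o" field, so they do not match.
FRAME_RE = re.compile(
    r'\{\s*"b"\s*:\s*"(?P<base>[0-9A-F]+)"\s*,\s*"o"\s*:\s*"(?P<offset>[0-9A-F]+)"'
    r'(?:\s*,\s*"s"\s*:\s*"(?P<symbol>[^"]+)")?\s*\}'
)


def symbolized_frames(log_text):
    """Return (symbol, offset) pairs for frames that carry a symbol name."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        if match.group("symbol"):
            frames.append((match.group("symbol"), int(match.group("offset"), 16)))
    return frames


# Frames copied from the backtrace in the repair output.
sample = (
    '{"b":"7FDC4D82F000","o":"81C074","s":"wt_err"},'
    '{"b":"7FDC4D82F000","o":"81C2CC","s":"wt_panic"},'
    '{"b":"7FDC4D82F000","o":"1E8C1DF"},'
    '{"b":"7FDC4D82F000","o":"1EDDB2F","s":"wiredtiger_open"}'
)

for symbol, offset in symbolized_frames(sample):
    print(f"{symbol} +0x{offset:X}")
```

On this sample the symbolized frames are `wt_err`, `wt_panic` and `wiredtiger_open`, which is consistent with the panic being raised while the storage engine was opening the database.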
|
| Comments |
| Comment by Chengcheng Ma [ 24/Jul/19 ] |
|
Many thanks, Kelsey Schubert. Using the files you attached, we were able to dump most of the data directly from the .wt data files with the wt utility, and we have now reloaded the dumped data into a mongod instance. Thanks again. |
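The exact commands used for the dump are not recorded in this ticket. As a rough sketch only, the snippet below builds a `wt dump` command line for one collection file. It assumes the option layout of the WiredTiger command-line utility (global `-h` for the home directory, global `-R` to run recovery, and `dump -f <output> <uri>`); the helper name `wt_dump_command`, the directory and the file names are all hypothetical.

```python
import shlex


def wt_dump_command(home, collection_file, output_path, run_recovery=False, wt_bin="wt"):
    """Build the argv list for dumping one collection with the wt utility.

    ASSUMPTION: flags follow the WiredTiger command-line utility
    (-h home, -R recovery, `dump -f output uri`); verify against the
    wt build that matches your MongoDB version before running anything.
    """
    # MongoDB stores each collection as <name>.wt; the table URI omits the suffix.
    name = collection_file[:-3] if collection_file.endswith(".wt") else collection_file
    cmd = [wt_bin, "-h", home]
    if run_recovery:
        cmd.append("-R")
    cmd += ["dump", "-f", output_path, "table:" + name]
    return cmd


# Hypothetical paths; always work on a copy of the dbpath, never the original.
cmd = wt_dump_command(
    "/path/to/dbpath-copy",
    "collection-0-example.wt",
    "collection-0-example.dump",
)
print(shlex.join(cmd))
```

If the data files use a block compressor such as snappy, the wt utility probably also needs the matching compressor extension loaded through its configuration option; that part is omitted here because the location of the extension depends on the local build.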
| Comment by Kelsey Schubert [ 22/Jul/19 ] |
|
Hi cora_ma, This error indicates additional corruption on disk affecting other files, and I'd strongly recommend verifying the integrity of your disks. Unfortunately, we do not have any scripts that would repair these files. Kind regards, |
| Comment by Chengcheng Ma [ 19/Jul/19 ] |
|
@Kelsey Schubert Would you please help? |
| Comment by Chengcheng Ma [ 18/Jul/19 ] |
|
Sorry for not uploading the log files mentioned in my first reply. mongod.log.2019-07-11T10-22-54 is the log from the attempt to start mongod after the crash. mongo_repair.log is the log from the repair run after replacing the files with the ones you uploaded. |
| Comment by Chengcheng Ma [ 17/Jul/19 ] |
|
@Kelsey Schubert, unfortunately we were not able to recover the dead mongod. Regarding what you mentioned, here is the corresponding information on the corruption you suspected:
Because all 3 nodes in the replica set are in the same state, we do not have an unaffected node to resync from.
Thanks a lot in advance. |
| Comment by Kelsey Schubert [ 17/Jul/19 ] |
|
cora_ma, this error message leads us to suspect some form of physical corruption.
The ideal resolution is to perform a clean resync from an unaffected node. If that is not possible, I've attached a repair attempt of the files you provided as repair_attempt.tar.gz. Kind regards, |
| Comment by Chengcheng Ma [ 17/Jul/19 ] |
|
Can anyone help? |