[SERVER-36970] Mongodb failed to start after a disk 100% full problem Created: 31/Aug/18 Updated: 04/Sep/18 Resolved: 04/Sep/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.2.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Xing Fang | Assignee: | Nick Brewer |
| Resolution: | Done | Votes: | 0 |
| Labels: | envm, rfi, rps, trcf, wtc | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | Linux |
| Participants: |
| Description |
|
I encountered a problem similar to the one described in this post: https://jira.mongodb.org/browse/SERVER-26237 and would like to know whether anybody can help with it. The mongod startup log is listed below:
2018-08-31T11:32:42.803+0800 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { port: 27017 }, replication: { oplogSizeMB: 1024, replSetName: "workaioplog" }, security: { authorization: "enabled" }, storage: { dbPath: "/var/lib/mongodb", journal: { enabled: false } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
***aborting after fassert() failure
Can anyone help generate the repair file for me? I have attached the files you will probably need:
|
| Comments |
| Comment by Nick Brewer [ 04/Sep/18 ] |
|
lorrie Glad to hear it worked! Due to the risk inherent in manipulating WiredTiger files, we don't publicly offer the tool that is used to perform this particular repair. Some considerations to help mitigate issues caused by unreliable storage layers or server failures:
-Nick |
| Comment by Xing Fang [ 04/Sep/18 ] |
|
Nick, thank you very much! It works! This looks like rocket science! You are a magician! BTW, can you share more information on how to repair it? |
| Comment by Nick Brewer [ 04/Sep/18 ] |
|
lorrie I've uploaded the files after a repair attempt - could you substitute them for the files currently in your dbpath, and let us know if this resolves the issue? I should note that we do not recommend manipulating .wt files with the wt binary directly - if you have a backup of the dbpath before you began making changes to it in this way, I would recommend using that backup when you attempt to start the mongod with the files I've provided. -Nick |
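The advice above about keeping a backup of the dbpath before swapping files could be followed with something like the minimal Python sketch below. It assumes mongod is stopped and that the backup location has enough free space; the function name `backup_dbpath` and the backup directory naming are hypothetical, chosen only for illustration.

```python
import shutil
import time
from pathlib import Path


def backup_dbpath(dbpath, backup_root):
    """Copy the whole dbpath into a new timestamped directory and return its path.

    Assumption: mongod is NOT running while the copy is taken.
    """
    src = Path(dbpath)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree raises FileExistsError if dest already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest


# Example call (paths are illustrative; /var/lib/mongodb is the dbPath from the log):
# backup_dbpath("/var/lib/mongodb", "/path/to/backups")
```

Only after the copy succeeds would the repaired files be placed into the original dbpath, so the untouched copy remains available if the start attempt fails.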
| Comment by Xing Fang [ 31/Aug/18 ] |
|
Updates:
I installed the wt tool following this link: "http://www.alexbevi.com/blog/2016/02/10/recovering-a-wiredtiger-collection-from-a-corrupt-mongodb-installation/?spm=a2c4e.11153940.blogcont73203.10.6ee15112TPYryA"
and ran the command: ./wt -v -h ../mongo-bak -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" -R salvage WiredTiger.wt
but got this error:
[1535753498:720382][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 61440: block header checksum of 3027734606 doesn't match expected checksum of 2157011047
[1535753498:720437][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
[1535753498:720468][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
lt-wt: WT_PANIC: WiredTiger library panic
It looks like the tool cannot repair this damaged WiredTiger.wt file.
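To make the checksum error easier to read, the message can be split into its numeric fields. The following is a minimal Python sketch that assumes the message format matches the line quoted in this comment; `parse_checksum_error` is a hypothetical helper, not part of the wt tool or of MongoDB.

```python
import re

# Matches the "read checksum error" message as quoted in this comment.
# Assumption: other WiredTiger versions may word the message differently.
CHECKSUM_ERROR = re.compile(
    r"read checksum error for (?P<size>\d+)B block at offset (?P<offset>\d+): "
    r"block header checksum of (?P<found>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return the block size, offset and both checksums as ints, or None if no match."""
    match = CHECKSUM_ERROR.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


sample = (
    "[1535753498:720382][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: "
    "read checksum error for 4096B block at offset 61440: "
    "block header checksum of 3027734606 doesn't match expected checksum of 2157011047"
)

info = parse_checksum_error(sample)
# Offset divided by block size gives the position of the bad block in 4096-byte units.
print(info, "block index:", info["offset"] // info["size"])
```

Here the offset 61440 is an exact multiple of the 4096-byte block size, so the failing read is the block at index 15 of WiredTiger.wt.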
|
| Comment by Xing Fang [ 31/Aug/18 ] |
|
BTW, I'm actually not sure whether WiredTiger.wt is the damaged file that needs repairing. My startup log mentions "WiredTiger.wt"; however, when I googled similar issues, the log always named a .wt file in a format like "collection-xxx--xxxx.wt", and I have a bunch of those in my mongodb folder. Do you have any clues? Thanks! |
| Comment by Xing Fang [ 31/Aug/18 ] |
|
Nick: Thank you very much!
Ans: It is Ubuntu 14.04.4 LTS, x86_64.
Ans: It is an Aliyun cloud ECS server (similar to an AWS EC2 instance). |
| Comment by Nick Brewer [ 31/Aug/18 ] |
|
lorrie I can attempt to repair these files - however beforehand I'd need to confirm:
Thanks, |