[SERVER-36970] Mongodb failed to start after a disk 100% full problem Created: 31/Aug/18  Updated: 04/Sep/18  Resolved: 04/Sep/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Xing Fang Assignee: Nick Brewer
Resolution: Done Votes: 0
Labels: envm, rfi, rps, trcf, wtc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File logfile.tar.gz     File repair-attempt-36970.tar.gz    
Operating System: Linux
Participants:

 Description   

I encountered a problem similar to the one in this post:

https://jira.mongodb.org/browse/SERVER-26237

and would like to know if anybody can help with it.

The mongod startup log is below:

2018-08-31T11:32:42.803+0800 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { port: 27017 }, replication: { oplogSizeMB: 1024, replSetName: "workaioplog" }, security: { authorization: "enabled" }, storage: { dbPath: "/var/lib/mongodb", journal: { enabled: false } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
2018-08-31T11:32:42.835+0800 I - [initandlisten] Detected data files in /var/lib/mongodb created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2018-08-31T11:32:42.848+0800 I STORAGE [initandlisten] Detected WT journal files. Running recovery from last checkpoint.
2018-08-31T11:32:42.848+0800 I STORAGE [initandlisten] journal to nojournal transition config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2018-08-31T11:32:42.885+0800 E STORAGE [initandlisten] WiredTiger (0) [1535686362:885029][899:0x7f49f1f32cc0], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 61440: block header checksum of 3027734606 doesn't match expected checksum of 2157011047
2018-08-31T11:32:42.885+0800 E STORAGE [initandlisten] WiredTiger (0) [1535686362:885098][899:0x7f49f1f32cc0], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2018-08-31T11:32:42.885+0800 E STORAGE [initandlisten] WiredTiger (-31804) [1535686362:885129][899:0x7f49f1f32cc0], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-08-31T11:32:42.885+0800 I - [initandlisten] Fatal Assertion 28558
2018-08-31T11:32:42.885+0800 I - [initandlisten]

***aborting after fassert() failure
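
For anyone comparing this with their own logs, the checksum error line above can be picked apart programmatically. The following is a minimal Python sketch that assumes the message wording shown in this log (other WiredTiger versions may phrase it differently). The function name `parse_checksum_error` and the derived `block_index` field are illustrative, not part of any MongoDB or WiredTiger tool.

```python
import re

# Pattern for the WiredTiger checksum message as it appears in this log.
# Assumption: the wording may differ in other WiredTiger versions.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>[^,\s]+), connection: read checksum error for "
    r"(?P<size>\d+)B block at offset (?P<offset>\d+): "
    r"block header checksum of (?P<found>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return a dict describing the checksum failure, or None if no match."""
    match = CHECKSUM_RE.search(line)
    if match is None:
        return None
    info = {
        "file": match.group("file"),
        "size": int(match.group("size")),
        "offset": int(match.group("offset")),
        "found": int(match.group("found")),
        "expected": int(match.group("expected")),
    }
    # Illustrative only: position of the block if the file is viewed as a
    # sequence of blocks of the reported size.
    info["block_index"] = info["offset"] // info["size"]
    return info


# The error line from the startup log above.
LOG_LINE = (
    "2018-08-31T11:32:42.885+0800 E STORAGE [initandlisten] WiredTiger (0) "
    "[1535686362:885029][899:0x7f49f1f32cc0], file:WiredTiger.wt, connection: "
    "read checksum error for 4096B block at offset 61440: block header "
    "checksum of 3027734606 doesn't match expected checksum of 2157011047"
)

result = parse_checksum_error(LOG_LINE)
```

Here `result["file"]` is `WiredTiger.wt` and `result["offset"]` is 61440, which confirms that the failing read is in the WiredTiger.wt file itself and not in one of the collection files.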

Could anyone help generate repaired files for me? I have attached the files you will probably need.

 

 



 Comments   
Comment by Nick Brewer [ 04/Sep/18 ]

lorrie Glad to hear it worked! Due to the risk inherent in manipulating WiredTiger files, we don't publicly offer the tool that is used to perform this particular repair.

Some considerations to help mitigate issues caused by unreliable storage layers or server failures: 

 

-Nick

Comment by Xing Fang [ 04/Sep/18 ]

Nick,  thank you very much! It works! This looks like rocket science! You are a magician!

BTW, can you share more information on how to repair it?

Comment by Nick Brewer [ 04/Sep/18 ]

lorrie I've uploaded the files after a repair attempt - could you substitute them for the files currently in your dbpath, and let us know if this resolves the issue?

I should note that we do not recommend manipulating .wt files with the wt binary directly - if you have a backup of the dbpath before you began making changes to it in this way, I would recommend using that backup when you attempt to start the mongod with the files I've provided.
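
As an illustration of the backup advice above, here is a minimal Python sketch that copies a dbpath to a timestamped directory before any repair attempt or file substitution. It assumes mongod is stopped and that the destination volume has enough free space. The helper name `backup_dbpath` and the backup location in the usage comment are hypothetical.

```python
import shutil
import time
from pathlib import Path


def backup_dbpath(dbpath, backup_root):
    """Copy the whole dbpath to a timestamped directory and return its path.

    Assumes mongod is stopped, so the files do not change during the copy.
    """
    src = Path(dbpath)
    if not src.is_dir():
        raise FileNotFoundError(f"dbpath not found: {src}")
    stamp = time.strftime("%Y%m%dT%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # copytree raises FileExistsError if dest already exists, so an earlier
    # backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest


# Example usage (paths are placeholders; /var/lib/mongodb is the dbPath
# from the startup log in the description):
#   backup_dbpath("/var/lib/mongodb", "/var/backups")
```

Keeping the copy on a different volume from the dbpath avoids filling the same disk again.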

-Nick

repair-attempt-36970.tar.gz

Comment by Xing Fang [ 31/Aug/18 ]

Updates:

 

I installed the wt tool by following this link: "http://www.alexbevi.com/blog/2016/02/10/recovering-a-wiredtiger-collection-from-a-corrupt-mongodb-installation/?spm=a2c4e.11153940.blogcont73203.10.6ee15112TPYryA"

and ran the command:

./wt -v -h ../mongo-bak -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" -R salvage WiredTiger.wt

but got this error:

 

[1535753498:720382][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 61440: block header checksum of 3027734606 doesn't match expected checksum of 2157011047

[1535753498:720437][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value

[1535753498:720468][28452:0x7f4f9ada3740], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic

lt-wt: WT_PANIC: WiredTiger library panic

 

It looks like wt cannot repair this damaged WiredTiger.wt file.

 

 

Comment by Xing Fang [ 31/Aug/18 ]

BTW, I'm actually not sure whether WiredTiger.wt is the damaged file that needs repairing.

My startup log names "WiredTiger.wt"; however, when I googled similar issues, the logs always named a file of the form "collection-xxx--xxxx.wt", and I have a bunch of those in my MongoDB folder.

Do you have any clues?

Thanks!
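
For context on the question above: WiredTiger.wt is the storage engine's own metadata table, which tracks the other tables in the dbpath, while the collection-*.wt and index-*.wt files hold the collection and index data. The following Python sketch groups the .wt files in a dbpath by that naming pattern. It assumes the default flat layout (no per-database or per-index subdirectories), and the function name `classify_wt_files` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def classify_wt_files(dbpath):
    """Group the .wt files directly under dbpath by their naming pattern."""
    groups = defaultdict(list)
    for path in sorted(Path(dbpath).glob("*.wt")):
        name = path.name
        if name == "WiredTiger.wt":
            kind = "wiredtiger-metadata"
        elif name.startswith("collection-"):
            kind = "collection"
        elif name.startswith("index-"):
            kind = "index"
        else:
            # Other MongoDB bookkeeping tables, e.g. _mdb_catalog.wt
            # and sizeStorer.wt.
            kind = "other"
        groups[kind].append(name)
    return dict(groups)


# Example usage (path taken from the startup log in the description):
#   for kind, names in classify_wt_files("/var/lib/mongodb").items():
#       print(kind, len(names))
```

A checksum error reported against WiredTiger.wt therefore points at the metadata file, even when the collection-*.wt files themselves are intact.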

Comment by Xing Fang [ 31/Aug/18 ]

Nick:

Thank you very much!

  • What operating system is MongoDB running on?

Ans: It is Ubuntu 14.04.4 LTS, x86_64.

  • What is the environment (virtual machine, container, etc)?

Ans: It is an Aliyun cloud ECS server (similar to AWS EC2).

Comment by Nick Brewer [ 31/Aug/18 ]

lorrie I can attempt to repair these files - however beforehand I'd need to confirm:

  • What operating system is MongoDB running on?
  • What is the environment (virtual machine, container, etc)?

Thanks,
Nick
