[SERVER-32179] Data recovery Created: 05/Dec/17  Updated: 07/Jan/18  Resolved: 07/Dec/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.10
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Pawel Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: WiredTiger-files.zip (Zip Archive), repair-SERVER-32179.tar.gz (File)
Participants:

 Description   

Hi,

We run MongoDB 3.2.10 in an OpenShift cluster. Data for one of the databases is stored on an NFS share that ran out of space, which corrupted the data. WiredTiger.turtle is empty, and I'm unable to start the server or run automated recovery. I've heard that manual recovery options exist. Are they documented anywhere?

Thanks,
Pawel
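
A quick way to confirm the symptom described above (an empty WiredTiger.turtle) before attempting any recovery is to check the metadata files in the dbpath for presence and size. The following is a minimal sketch, not an official tool: the function name `check_wiredtiger_metadata` is hypothetical, and it only looks at the two files named in this ticket.

```python
from pathlib import Path

# The two WiredTiger metadata files mentioned in this ticket.
METADATA_FILES = ("WiredTiger.turtle", "WiredTiger.wt")


def check_wiredtiger_metadata(dbpath):
    """Report whether each metadata file is missing, empty, or present.

    Hypothetical helper: it only inspects file existence and size and
    says nothing about whether the file contents are valid.
    """
    report = {}
    for name in METADATA_FILES:
        path = Path(dbpath) / name
        if not path.is_file():
            report[name] = "missing"
        elif path.stat().st_size == 0:
            report[name] = "empty"
        else:
            report[name] = "present"
    return report
```

For example, `check_wiredtiger_metadata("/var/lib/mongodb/data")` returns a dict keyed by file name. A status of "present" only means the file is non-empty, so it does not rule out corruption.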



 Comments   
Comment by Mark Agarunov [ 07/Dec/17 ]

Hello pkar,

Unfortunately, this error indicates that there was corruption on the disk. In this situation, my best recommendation would be to resync the affected node or restore from a backup if possible.

Thanks,
Mark
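
If a backup exists, the restore path mentioned above could look roughly like the sketch below. It assumes the backup was taken with mongodump, that a fresh mongod is already running on an empty dbpath, and that `mongorestore` is on the PATH. The function names, the default host and port, and any dump directory passed in are illustrative, not values from this ticket.

```python
import subprocess


def build_mongorestore_cmd(dump_dir, host="localhost", port=27017, drop=False):
    """Build the argv list for a mongorestore run (hypothetical helper)."""
    cmd = ["mongorestore", "--host", host, "--port", str(port)]
    if drop:
        # --drop removes each collection on the target before restoring it.
        cmd.append("--drop")
    cmd.append(str(dump_dir))
    return cmd


def restore(dump_dir, **kwargs):
    """Run mongorestore and raise CalledProcessError on a non-zero exit."""
    subprocess.run(build_mongorestore_cmd(dump_dir, **kwargs), check=True)
```

Building the command separately from running it makes it easy to print and review the exact invocation before touching the target instance.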

Comment by Pawel [ 07/Dec/17 ]

Hi Mark,

This didn't help. Log trace:
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] MongoDB starting : pid=30 port=27017 dbpath=/var/lib/mongodb/data 64-bit host=db-debug
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] db version v3.2.10
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] git version: 79d9b3ab5ce20f51c272b4411202710a082d0317
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] allocator: tcmalloc
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] modules: none
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] build environment:
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] distarch: x86_64
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] target_arch: x86_64
2017-12-07T18:18:29.359+0000 I CONTROL [initandlisten] options: { storage: { dbPath: "/var/lib/mongodb/data" }, systemLog: { destination: "file", path: "/var/lib/mongodb/data/log.txt", verbosity: 6 } }
2017-12-07T18:18:29.359+0000 D NETWORK [initandlisten] fd limit hard:1048576 soft:1048576 max conn: 838860
2017-12-07T18:18:29.364+0000 I - [initandlisten] Detected data files in /var/lib/mongodb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-12-07T18:18:29.366+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-12-07T18:18:29.377+0000 I - [initandlisten] Assertion: 28595:2: No such file or directory
2017-12-07T18:18:29.377+0000 I STORAGE [initandlisten] exception in initAndListen: 28595 2: No such file or directory, terminating
2017-12-07T18:18:29.377+0000 I CONTROL [initandlisten] dbexit: rc: 100
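
At verbosity 6 the relevant lines are easy to miss, so the failure in a trace like the one above can be pulled out with a small script. This is a rough sketch that assumes the line layout shown in this log (timestamp, severity letter, component, bracketed context, message). The function name `find_startup_failure` is hypothetical.

```python
import re

# Layout assumed from the trace above:
# <timestamp> <severity> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[A-Z])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<message>.*)$"
)
EXIT_CODE = re.compile(r"dbexit:\s+rc:\s+(\d+)")


def find_startup_failure(lines):
    """Return (failure_messages, exit_code) found in mongod log lines.

    exit_code is None when no 'dbexit: rc:' line is present.
    """
    failures = []
    exit_code = None
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # continuation lines or unrecognised formats
        message = match.group("message")
        if message.startswith(("Assertion:", "exception in initAndListen")):
            failures.append(message)
        rc = EXIT_CODE.search(message)
        if rc:
            exit_code = int(rc.group(1))
    return failures, exit_code
```

Run against the trace above, this picks out the assertion 28595 lines and the `dbexit` exit code of 100.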

Answers to your questions:
1. NFS storage
2. n/a
3. always this version of MongoDB
4. No
5. Yes. It was running ok afterwards
6. mongodump
7. it's clean

Thanks,
Pawel

Comment by Mark Agarunov [ 07/Dec/17 ]

Hello pkar,

Thank you for providing these files. I've attached a repair attempt of the files you've provided. Would you please extract these files, use them to replace the corresponding files in your $dbpath, and let us know if it resolves the issue? If you are still seeing errors after replacing these files, please provide the complete logs from mongod so that we can further investigate. Additionally, if this issue persists, please provide the following information:

  1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
  2. Would you please check the integrity of your disks?
  3. Has the database always been running this version of MongoDB? If not please describe the upgrade/downgrade cycles the database has been through.
  4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
  5. Have you ever restored this instance from backups?
  6. What method do you use to create backups?
  7. When was the underlying filesystem last checked and is it currently marked clean?

Thanks,
Mark
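
The extract-and-replace step requested above can be sketched as follows. This is an illustration rather than a supported procedure. It assumes mongod is stopped and that the files in the attached archive should be written directly into the dbpath under their base names (the archive's internal layout is not described in this ticket). It copies the whole dbpath to a backup directory first. The function name `apply_repair` and the backup location are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def apply_repair(archive, dbpath, backup_dir):
    """Back up dbpath, then overwrite files in it with those from the archive.

    Assumes mongod is NOT running. Returns the sorted names of replaced files.
    """
    dbpath = Path(dbpath)
    backup_dir = Path(backup_dir)
    if backup_dir.exists():
        raise FileExistsError(f"backup directory already exists: {backup_dir}")

    # Keep an untouched copy so the attempt can be rolled back.
    shutil.copytree(dbpath, backup_dir)

    replaced = []
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Use only the base name so nothing is written outside dbpath.
            name = Path(member.name).name
            source = tar.extractfile(member)
            with open(dbpath / name, "wb") as target:
                shutil.copyfileobj(source, target)
            replaced.append(name)
    return sorted(replaced)
```

A call might look like `apply_repair("repair-SERVER-32179.tar.gz", "/var/lib/mongodb/data", "/var/lib/mongodb/data.bak")`, where the backup path is only an example. If mongod still fails to start afterwards, the backup copy preserves the original state for further investigation.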

Comment by Pawel [ 07/Dec/17 ]

Hi Mark,

Files are attached.

Thanks!
Pawel

Comment by Mark Agarunov [ 06/Dec/17 ]

Hello pkar,

Thank you for the report. If you can provide the WiredTiger.wt and WiredTiger.turtle files we can attempt a repair of the database, but please keep in mind that this is not a guaranteed fix.

Thanks,
Mark
