[SERVER-49551] Improve salvage functionality Created: 16/Jul/20  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 4.2.2
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Or Shtar Assignee: Backlog - Storage Engines Team
Resolution: Unresolved Votes: 0
Labels: FA_50947
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive SERVER-49551_repair_attempt.zip    
Assigned Teams:
Storage Engines
Participants:
Case:

 Description   

Hi!

There was an unexpected reboot on the server, and I don't have backups.

I tried the --repair option, but it didn't help.

What should I do?

 

Failed to salvage WiredTiger metadata: -31809: WT_TRY_SALVAGE: database corruption detected
2020-07-16T17:44:52.484+0200 F - [initandlisten] Fatal Assertion 50947 at src\mongo\db\storage\wiredtiger\wiredtiger_kv_engine.cpp 804
2020-07-16T17:44:52.485+0200 F - [initandlisten]

***aborting after fassert() failure
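
For readers triaging a similar log, here is a minimal sketch of pulling out the fatal (severity F) lines. It assumes the plain-text layout seen in the excerpt above (timestamp, severity letter, component, bracketed context, message); the names `LINE_RE` and `fatal_lines` are illustrative only and are not part of any MongoDB tool.

```python
import re

# Assumed layout, based on the excerpt above:
#   <timestamp> <severity> <component> [<context>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[FEWID])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s*(?P<message>.*)$"
)


def fatal_lines(log_text):
    """Return a list of dicts, one per log line whose severity is F (fatal)."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        # Lines without the expected prefix (e.g. "***aborting ...") are skipped.
        if match and match.group("severity") == "F":
            hits.append(match.groupdict())
    return hits
```

Calling `fatal_lines(open(path).read())` on a log file returns the fatal entries with their timestamp, context, and message, which can make it quicker to find the assertion number to search for.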



 Comments   
Comment by Sadaf Siddiqui [ 27/Sep/21 ]

I am having the same issue. Could you please help?

Comment by Or Shtar [ 28/Jul/20 ]

Got it, thank you.
Or.

Comment by Louis Williams [ 28/Jul/20 ]

orshtar30@gmail.com, the problem is that the storage devices you are using do not appear to be reliable in the event of an unexpected shutdown. All of Dmitry's suggestions are ways you can introduce redundancy to avoid the single point of failure that is your storage device.

Comment by Or Shtar [ 22/Jul/20 ]

These are all ways to keep the data safe, right? Would the problem in the WiredTiger files have happened anyway because of the shutdown?

Comment by Dmitry Agranat [ 22/Jul/20 ]

Hi orshtar30@gmail.com,

We are very glad to hear this.

It is hard to tell what the exact cause of this issue was.

To avoid a problem like this in the future, it is our strong recommendation to:

Regards,
Dima

Comment by Or Shtar [ 21/Jul/20 ]

It's working! Thank you so much!

I would be happy to know what the exact problem was. Also, what do you recommend I do to prevent this from happening again and to protect my data better? I'm very new to this, so it would help a lot.

Again, thank you! 

Or.

Comment by Dmitry Agranat [ 21/Jul/20 ]

Hi orshtar30@gmail.com,

I've attached a repair attempt of the files you provided as SERVER-49551_repair_attempt.zip. Please extract these files, replace them in your $dbpath, and let us know if it resolves the issue.
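
A minimal sketch of the replacement step, for anyone following along. It assumes the archive has already been extracted to a folder, that mongod is stopped, and that every file in that folder should overwrite the file of the same name in the dbpath. The function name and all three paths are hypothetical; the originals are copied to a backup folder first so the step can be undone.

```python
import shutil
from pathlib import Path


def install_repaired_files(repaired_dir, dbpath, backup_dir):
    """Copy repaired files into dbpath, backing up any file they replace.

    Assumes mongod is NOT running. Returns the names of the files copied.
    """
    repaired_dir = Path(repaired_dir)
    dbpath = Path(dbpath)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(repaired_dir.iterdir()):
        if not src.is_file():
            continue  # only top-level regular files are handled in this sketch
        dst = dbpath / src.name
        if dst.exists():
            # Keep the pre-repair file so the change can be reverted.
            shutil.copy2(dst, backup_dir / src.name)
        shutil.copy2(src, dst)
        copied.append(src.name)
    return copied
```

To revert, copy the files from the backup folder back into the dbpath while mongod is stopped.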

"What do you mean by it may not work? Do you mean that we cannot salvage the data, or that the whole MongoDB service won't work and we need to reinstall it?"

I assume we'll know the answer to that after trying the repair attempt.

Thanks,
Dima

Comment by Or Shtar [ 21/Jul/20 ]

Hi Dima,

I uploaded the files.

What do you mean by it may not work? Do you mean that we cannot salvage the data, or that the whole MongoDB service won't work and we need to reinstall it?

Thanks,

Or

Comment by Dmitry Agranat [ 21/Jul/20 ]

Hi orshtar30@gmail.com,

Please attach copies of the WiredTiger.wt and WiredTiger.turtle files and we can attempt a metadata-only repair effort using internal tools.

Keep in mind that this repair effort may not be successful, and that diagnosing corruption issues requires significant information and effort.

Thanks,
Dima

Comment by Or Shtar [ 19/Jul/20 ]

Hi,

I uploaded the log from before the repair; it shows the mongo service's attempts to restart after the server shut down.

The repair operation did not complete: the error I posted above occurred during the repair. I added a file, "repair.txt", containing the repair operation's output from the command prompt.

I also added "journal.zip" of the WiredTiger logs.

The server is not part of a replica set. I also have a copy of the data folder before the repair operation (after the shut down) if it is any good.

Edit: added a full log from before the repair operation (it's very long).

Comment by Dmitry Agranat [ 19/Jul/20 ]

Hi orshtar30@gmail.com,

Would you please provide the following information?

  • The full logs of the initial failure (prior to repair attempt)
  • The full logs of the repair operation.
  • The full logs of any attempt to start mongod after the repair operation completed.

Is this server a part of a replica set?

I've created a secure upload portal for you. Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Thanks,
Dima

Generated at Thu Feb 08 05:20:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.