[SERVER-38406] collection filename.wt does not appear to be a WiredTiger file Created: 05/Dec/18  Updated: 01/Feb/19  Resolved: 01/Feb/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.11
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Den Fomichev Assignee: Danny Hatcher (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File WiredTiger.turtle     File WiredTiger.wt     File repair_attempt.tar    
Participants:

 Description   

After a clean shutdown of the VM, we cannot start MongoDB again. The log shows this error:

2018-12-03T20:44:37.130+0000 I STORAGE  [initandlisten] Detected WT journal files.  Running recovery from last checkpoint.
2018-12-03T20:44:37.130+0000 I STORAGE  [initandlisten] journal to nojournal transition config: create,cache_size=6G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (-31802) [1543869878:580654][2363:0x7f49e439acc0], file:collection-2--2481999773037646866.wt, txn-recover: collection-2--2481999773037646866.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (-31802) [1543869878:580739][2363:0x7f49e439acc0], file:collection-2--2481999773037646866.wt, txn-recover: operation apply failed during recovery: operation type 4 at LSN 59643/86453504: WT_ERROR: non-specific WiredTiger error
2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (0) [1543869878:580755][2363:0x7f49e439acc0], file:collection-2--2481999773037646866.wt, txn-recover: WiredTiger is unable to read the recovery log.
2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (0) [1543869878:580769][2363:0x7f49e439acc0], file:collection-2--2481999773037646866.wt, txn-recover: This may be due to the log files being encrypted, being from an older version or due to corruption on disk
2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (0) [1543869878:580780][2363:0x7f49e439acc0], file:collection-2--2481999773037646866.wt, txn-recover: You should confirm that you have opened the database with the correct options including all encryption and compression options
2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (-31802) [1543869878:580798][2363:0x7f49e439acc0], file:collection-2--2481999773037646866.wt, txn-recover: Recovery failed: WT_ERROR: non-specific WiredTiger error
2018-12-03T20:44:38.643+0000 I -        [initandlisten] Assertion: 28718:-31802: WT_ERROR: non-specific WiredTiger error
2018-12-03T20:44:38.643+0000 I STORAGE  [initandlisten] exception in initAndListen: 28718 -31802: WT_ERROR: non-specific WiredTiger error, terminating
2018-12-03T20:44:38.643+0000 I CONTROL  [initandlisten] dbexit:  rc: 100
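
To pull the failing file name and log sequence number (LSN) out of a log like the one above, a short script can help. This is a minimal sketch: the regular expression is an assumption based only on the single "operation apply failed during recovery" line quoted here, and the function name find_recovery_failure is hypothetical, not part of any MongoDB or WiredTiger tool.

```python
import re

# Pattern modelled on the one recovery-failure line quoted in this ticket.
# It is an assumption; other WiredTiger versions may word the message differently.
RECOVERY_FAILURE = re.compile(
    r"file:(?P<file>\S+?), txn-recover: operation apply failed during recovery: "
    r"operation type (?P<optype>\d+) at LSN (?P<lsn_a>\d+)/(?P<lsn_b>\d+)"
)


def find_recovery_failure(log_lines):
    """Return details of the first failed recovery operation, or None."""
    for line in log_lines:
        match = RECOVERY_FAILURE.search(line)
        if match:
            return {
                "file": match.group("file"),
                "operation_type": int(match.group("optype")),
                # The two numbers of the LSN, kept in the order they are printed.
                "lsn": (int(match.group("lsn_a")), int(match.group("lsn_b"))),
            }
    return None


# The sample is the line from the log in this ticket, split for readability.
sample = (
    "2018-12-03T20:44:38.580+0000 E STORAGE  [initandlisten] WiredTiger (-31802) "
    "[1543869878:580739][2363:0x7f49e439acc0], "
    "file:collection-2--2481999773037646866.wt, txn-recover: operation apply failed "
    "during recovery: operation type 4 at LSN 59643/86453504: "
    "WT_ERROR: non-specific WiredTiger error"
)
print(find_recovery_failure([sample]))
```

In practice the lines would come from the mongod log file, for example by passing an open file object instead of the one-element list.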

Other ways of getting the data out of this collection file and moving it to another MongoDB instance fail with the same error.

We tried:
1) mongod --repair (same error in the logs)

2) The wt tool, built specifically for our WiredTiger version (http://source.wiredtiger.com/2.9.2/command_line.html#util_salvage)

./wt -v -h /storage1/restore_checkArchive -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]"  -R verify collection-2--2481999773037646866

[1544005066:138304][28344:0x7fe82dc5c300], file:collection-2--2481999773037646866.wt, WT_SESSION.verify: collection-2--2481999773037646866.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error

./wt -v -h /storage1/restore_checkArchive -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]"  -R salvage collection-2--2481999773037646866.wt

[1544005420:773648][29867:0x7f8a8dbe0300], file:collection-2--2481999773037646866.wt, WT_SESSION.salvage: collection-2--2481999773037646866.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
wt: session.salvage: file:collection-2--2481999773037646866.wt: WT_ERROR: non-specific WiredTiger error

With the help of the wt utility (salvage -> dump -> load) we recovered some other collection files for use in another MongoDB instance, but not this one.
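
For reference, the salvage -> dump -> load sequence can be sketched as a small Python wrapper around the wt binary. This is a hedged illustration, not the exact procedure used here: the global options are copied from the commands quoted above, the -f option for dump and load is taken from the WiredTiger command-line documentation linked earlier and should be checked against your own wt build, and the directories, dump path, and helper names are hypothetical placeholders.

```python
import shlex
import subprocess

# Binary path and snappy extension are copied from the commands in this ticket.
WT_BINARY = "./wt"
SNAPPY_EXT = "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]"


def wt_command(home, *args):
    """Build one wt command line for the given database home directory."""
    return [WT_BINARY, "-v", "-h", home, "-C", SNAPPY_EXT, "-R", *args]


def salvage_dump_load_commands(source_home, target_home, ident, dump_path):
    """Return the three commands of a salvage -> dump -> load sequence.

    ident is the file name without the .wt suffix. Using the table: URI for
    dump is an assumption about how the collection is registered.
    """
    return [
        wt_command(source_home, "salvage", ident + ".wt"),
        wt_command(source_home, "dump", "-f", dump_path, "table:" + ident),
        wt_command(target_home, "load", "-f", dump_path),
    ]


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Placeholder paths and ident; printing only, nothing is executed here.
for command in salvage_dump_load_commands(
    "/path/to/source_dbpath", "/path/to/target_dbpath",
    "collection-EXAMPLE", "/tmp/collection-EXAMPLE.dump",
):
    print(shlex.join(command))
```

Calling run_all on the returned list would execute the steps. Work on a copy of the data files, since salvage rewrites the file in place, and note that attaching the loaded table to a collection in the target mongod is outside this sketch.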

Is there some other way to fix this file, or at least to extract the data from it?

What additional data (files or similar) do you need to clarify the problem? The problem collection file is close to 200 GB in size.



 Comments   
Comment by Danny Hatcher (Inactive) [ 06/Dec/18 ]

Hello Den,

I'm sorry, I should have been more clear. Unfortunately, it is likely that the file is unrecoverable. You may be able to start the node again by replacing the WiredTiger.wt and WiredTiger.turtle files with the ones in repair_attempt.tar and then deleting the collection file, but the data within that collection will not exist in the new process.

In this situation, our best recommendation would be to resync the affected node or restore from a backup if possible.

To prevent this type of problem in the future please take note of the following guidelines to help mitigate any issues related to unreliable storage layers or server failures.

  • Make sure your underlying storage is configured in an optimal way.
  • Schedule and perform regular checks of the integrity of your filesystems and disks.
  • Make sure to update MongoDB to the most recent version. Please note that MongoDB 3.2 has reached end-of-life status so we recommend upgrading to at least MongoDB 3.4 when possible.
  • Never manipulate the underlying database files in any way while mongod is running.
  • Always keep up to date backups of your databases and verify that you have a process in place to restore them.
  • Use a replica set for improved reliability. This is strongly recommended for a Production system.

For further MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-user group. Users there may have an unofficial method of recovering the data.

Thank you,

Danny

Comment by Den Fomichev [ 06/Dec/18 ]

> If you attach your WiredTiger.wt and WiredTiger.turtle files to this ticket, I can attempt a recovery using an internal tool.

The requested files have been added to the ticket. Do you need the full collection file (~200 GB) to attempt a recovery yourself, or will you supply us with instructions so that we can try it ourselves?

Thank you for your effort.

Comment by Danny Hatcher (Inactive) [ 05/Dec/18 ]

Hello Den,

If you attach your WiredTiger.wt and WiredTiger.turtle files to this ticket, I can attempt a recovery using an internal tool. However, if that does not work then the only other recourse would be for you to restore from a backup or perform an initial sync from a healthy replica set node.

Thank you,

Danny
