[SERVER-40067] WiredTiger.wt corrupted Created: 11/Mar/19  Updated: 20/Mar/19  Resolved: 20/Mar/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.1
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Michael
Assignee: Eric Sedor
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Server 2012


Attachments: Zip Archive DBrepairWT.zip; Zip Archive WTFiles3202019.zip; File WiredTiger.turtle; File WiredTiger.wt; File WiredTiger.wt; Zip Archive WiredTiger.zip; Zip Archive mongo14032019.zip; File repairedFrom03132019.tgz
Participants:

 Description   

Hi Support,

It looks like the WiredTiger.wt file has been corrupted, as shown in the error below. Is it possible to repair this file?

2019-03-07T14:14:52.704+0000 I CONTROL [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2019-03-07T14:14:52.704+0000 I CONTROL [initandlisten] db version v3.2.1
2019-03-07T14:14:52.704+0000 I CONTROL [initandlisten] git version: a14d55980c2cdc565d4704a7e3ad37e4e535c1b2
2019-03-07T14:14:52.704+0000 I CONTROL [initandlisten] allocator: tcmalloc
2019-03-07T14:14:52.704+0000 I CONTROL [initandlisten] modules: none
2019-03-07T14:14:52.705+0000 I CONTROL [initandlisten] build environment:
2019-03-07T14:14:52.705+0000 I CONTROL [initandlisten] distmod: 2008plus
2019-03-07T14:14:52.705+0000 I CONTROL [initandlisten] distarch: x86_64
2019-03-07T14:14:52.705+0000 I CONTROL [initandlisten] target_arch: x86_64
2019-03-07T14:14:52.705+0000 I CONTROL [initandlisten] options: { config: "C:\Program Files\test\APM_Mongo\etc\Mongo.yaml ", storage: { dbPath: "D:\test\Data", directoryPerDB: true, journal: { enabled: false } }, systemLog: { destination: "file", logAppend: true, path: "C:\Program Files\test\APM_Mongo\logs\mongo.log" } }
2019-03-07T14:14:52.706+0000 I - [initandlisten] Detected data files in D:\test\Data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2019-03-07T14:14:52.706+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=114G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),,log=(enabled=false),
2019-03-07T14:14:52.731+0000 E STORAGE [initandlisten] WiredTiger (0) [1551968092:731984][16404:140720579154768], file:WiredTiger.wt, connection: read checksum error for 4096B block at offset 12288: block header checksum of 696200596 doesn't match expected checksum of 730211347
2019-03-07T14:14:52.731+0000 E STORAGE [initandlisten] WiredTiger (0) [1551968092:731984][16404:140720579154768], file:WiredTiger.wt, connection: WiredTiger.wt: encountered an illegal file format or internal value
2019-03-07T14:14:52.731+0000 E STORAGE [initandlisten] WiredTiger (-31804) [1551968092:731984][16404:140720579154768], file:WiredTiger.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2019-03-07T14:14:52.732+0000 I - [initandlisten] Fatal Assertion 28558
2019-03-07T14:14:52.732+0000 I - [initandlisten]

***aborting after fassert() failure
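
For anyone triaging a similar log, the key details are in the "read checksum error" line: the affected file, the block size, the byte offset, and the two checksums. The following is a minimal Python sketch for pulling those fields out of such a line. The names `ChecksumError` and `parse_checksum_error` are hypothetical helpers, not part of MongoDB or WiredTiger, and the pattern is only assumed to match the message wording seen in this log.

```python
import re
from typing import NamedTuple, Optional


class ChecksumError(NamedTuple):
    """Fields extracted from a WiredTiger 'read checksum error' log line."""
    file: str
    block_size: int
    offset: int
    found: int
    expected: int


# Pattern written against the message wording in this ticket's log only;
# other WiredTiger versions may phrase the message differently.
_CHECKSUM_PATTERN = re.compile(
    r"file:(?P<file>\S+?), .*?read checksum error for (?P<size>\d+)B block "
    r"at offset (?P<offset>\d+): block header checksum of (?P<found>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line: str) -> Optional[ChecksumError]:
    """Return the parsed fields, or None if the line is not a checksum error."""
    match = _CHECKSUM_PATTERN.search(line)
    if match is None:
        return None
    return ChecksumError(
        file=match["file"],
        block_size=int(match["size"]),
        offset=int(match["offset"]),
        found=int(match["found"]),
        expected=int(match["expected"]),
    )
```

Applied to the error line above, this would report a 4096-byte block at offset 12288 in WiredTiger.wt, which can help when comparing repeated failures across restarts.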

 

 



 Comments   
Comment by Eric Sedor [ 20/Mar/19 ]

mclass, we've attempted a repair and attached the results as repairedFrom03132019.tgz.

Importantly, before starting mongod, please store a copy of these repaired files alongside the copy of the dbpath from that time, as a backup. You may need to return to this backup as you explore how to protect your system against further issues.

Again, please consider MongoDB 4.0.

Comment by Michael [ 20/Mar/19 ]

Hi Eric,

 

Thank you. I have attached the zip "WTFiles3202019.zip" containing the WT files from the 13th and also from the 4.0 repair.

 

Comment by Eric Sedor [ 20/Mar/19 ]

Please provide the .wt and .turtle files as of your Mar 14 2019 11:44:10 AM GMT comment. We can try our internal repair script one more time from that point.

Going forward, we strongly recommend you look into your disk setup (and our other recommendations above) given how quickly corruption occurred again. We also strongly recommend you upgrade to MongoDB 4.0, which has improved repair capabilities.

Again, 3.2 reached end of life in September 2018. Unfortunately, we cannot continue to provide repair attempts on unsupported versions.

Comment by Michael [ 20/Mar/19 ]

Hi Eric,

I see you have closed the ticket; however, may I ask one more question? I installed Mongo 4 as you suggested and ran the repair against the Mongo 3.2 data directory, which completed. However, when I start the Mongo 3.2 service, it logs the following and the service stops.

 

874+0000 I CONTROL [initandlisten] MongoDB starting : pid=48504 port=27017 dbpath=D:\xxxx\Data 64-bit host=WINDOWS-QQUTJK2
2019-03-19T19:44:28.874+0000 I CONTROL [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2019-03-19T19:44:28.874+0000 I CONTROL [initandlisten] db version v3.2.1
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] git version: a14d55980c2cdc565d4704a7e3ad37e4e535c1b2
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] allocator: tcmalloc
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] modules: none
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] build environment:
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] distmod: 2008plus
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] distarch: x86_64
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] target_arch: x86_64
2019-03-19T19:44:28.875+0000 I CONTROL [initandlisten] options: { config: "C:\Program Files\xxxx\APM_Mongo\etc\Mongo.yaml ", storage: { dbPath: "D:\xxxx\Data", directoryPerDB: true, journal: { enabled: false } }, systemLog: { destination: "file", logAppend: true, path: "C:\Program Files\xxxx\APM_Mongo\logs\mongo.log" } }
2019-03-19T19:44:28.876+0000 I - [initandlisten] Detected data files in D:\xxxx\Data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2019-03-19T19:44:28.876+0000 I STORAGE [initandlisten] Detected WT journal files. Running recovery from last checkpoint.
2019-03-19T19:44:28.876+0000 I STORAGE [initandlisten] journal to nojournal transition config: create,cache_size=114G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2019-03-19T19:44:29.060+0000 E STORAGE [initandlisten] WiredTiger (-31802) [1553024669:60746][48504:140720579154768], txn-recover: unsupported WiredTiger file version: this build only supports major/minor versions up to 1/0, and the file is version 3/0: WT_ERROR: non-specific WiredTiger error
2019-03-19T19:44:29.060+0000 E STORAGE [initandlisten] WiredTiger (-31802) [1553024669:60746][48504:140720579154768], txn-recover: Recovery failed: WT_ERROR: non-specific WiredTiger error
2019-03-19T19:44:29.063+0000 I - [initandlisten] Assertion: 28718:-31802: WT_ERROR: non-specific WiredTiger error
2019-03-19T19:44:29.063+0000 I STORAGE [initandlisten] exception in initAndListen: 28718 -31802: WT_ERROR: non-specific WiredTiger error, terminating
2019-03-19T19:44:29.063+0000 I CONTROL [initandlisten] dbexit: rc: 100
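
The "unsupported WiredTiger file version" line states both the highest version this 3.2.1 build supports (1/0) and the version found on disk (3/0). A minimal Python sketch for extracting and comparing the two is shown here. The function names are hypothetical, and the pattern is only assumed to match the wording in this log. Whether the newer files were written by the 4.0 repair run is an inference from the sequence of events in this ticket, not something the log itself states.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int]

# Pattern written against the message wording in this ticket's log only.
_VERSION_PATTERN = re.compile(
    r"supports major/minor versions up to (\d+)/(\d+), "
    r"and the file is version (\d+)/(\d+)"
)


def parse_version_mismatch(line: str) -> Optional[Tuple[Version, Version]]:
    """Return (supported, found) as (major, minor) tuples, or None if absent."""
    match = _VERSION_PATTERN.search(line)
    if match is None:
        return None
    supported = (int(match[1]), int(match[2]))
    found = (int(match[3]), int(match[4]))
    return supported, found


def file_is_too_new(supported: Version, found: Version) -> bool:
    """True when the on-disk version is newer than the build can read."""
    # Tuples compare element by element, so major is checked before minor.
    return found > supported
```

For the line above, the parsed pair would be (1, 0) and (3, 0), so `file_is_too_new` returns True: the 3.2.1 binary cannot read files at that version.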

 

I have also attached the repaired WiredTiger.wt and WiredTiger.turtle in the zip "DBrepairWT.zip".

Comment by Eric Sedor [ 14/Mar/19 ]

mclass, it looks like there are several important errors, warnings, and restarts in the logs prior to the checksum error, including an indication that the disk ran out of space.
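
Since running out of disk space appears in the logs, a simple free-space check on the volume holding the dbpath may be worth scheduling. The following is a minimal Python sketch using the standard library's `shutil.disk_usage`. The function name `free_space_report` and the threshold are illustrative assumptions, not a MongoDB recommendation.

```python
import shutil
from typing import Dict, Union


def free_space_report(path: str, min_free_bytes: int) -> Dict[str, Union[int, bool]]:
    """Report total and free bytes for the volume containing `path`.

    `min_free_bytes` is a caller-chosen threshold; "ok" is False when the
    free space on that volume has dropped below it.
    """
    usage = shutil.disk_usage(path)
    return {
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "ok": usage.free >= min_free_bytes,
    }
```

For example, `free_space_report(r"D:\test\Data", 50 * 1024**3)` would flag the volume once less than 50 GiB remains; the right threshold depends on data growth and is left to the operator.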

We do try to help with corruption in earlier versions where possible. But MongoDB 3.2 reached end of life in September 2018, and the SERVER project is for bugs and feature requests for active versions of MongoDB.

At this time we'd like to ask that you look into using MongoDB 4.0's --repair option and consider our recommendations above. To clarify about backing up the path, we were not suggesting anything fancy: just make a copy of the dbpath so that you don't run --repair on the only copy of the files.

For further assistance troubleshooting, please post on the mongodb-user group or on Stack Overflow with the mongodb tag.

Comment by Michael [ 14/Mar/19 ]

Hi Eric,

I replaced the files with the ones you provided; however, after a short time the mongo service stops. The logs I attached show the same error message towards the end of the log file, starting at 2019-03-14T10:35:55.122+0000.

 

 

Comment by Eric Sedor [ 13/Mar/19 ]

Understood. Since this has happened again, please take note of the following guidelines to help mitigate any issues related to unreliable storage layers or server failures.

I've attached a repair attempt of the files you provided (dated Mar 13 2019 01:56:52 PM GMT-0400). Please extract these files and replace them in your $dbpath.

Comment by Michael [ 13/Mar/19 ]

Hi Eric,

 

Thank you for the feedback. I don't know much about Mongo or backing up the path.

I had this same problem last year which was resolved under ticket https://jira.mongodb.org/browse/SERVER-34534?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel

 

I have attached the log and turtle file.

Comment by Eric Sedor [ 13/Mar/19 ]

Hello, as a first step, can you please:

  • Download MongoDB 4.0
  • Make a copy of your current dbpath as a safeguard
  • Using version 4.0, run mongod --repair on the dbpath
  • Restart the 3.2.1 version of mongod
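
The copy-then-repair steps above could be scripted roughly as follows. This is a minimal Python sketch rather than an official procedure: the helper names and any paths passed in are hypothetical, mongod is assumed to be stopped before the copy is taken, and the only mongod options used are `--dbpath` and `--repair`.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def backup_dbpath(dbpath: str, backup_dir: str) -> Path:
    """Copy the whole dbpath to backup_dir, refusing to overwrite anything."""
    src, dst = Path(dbpath), Path(backup_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"dbpath not found: {src}")
    if dst.exists():
        raise FileExistsError(f"backup target already exists: {dst}")
    shutil.copytree(src, dst)  # mongod should be stopped while this runs
    return dst


def repair_command(mongod_4_0: str, dbpath: str) -> List[str]:
    """Build the argument list for running the 4.0 mongod with --repair."""
    return [mongod_4_0, "--dbpath", dbpath, "--repair"]


def backup_then_repair(mongod_4_0: str, dbpath: str, backup_dir: str) -> int:
    """Take the safeguard copy first, then run the repair; return its exit code."""
    backup_dbpath(dbpath, backup_dir)
    return subprocess.run(repair_command(mongod_4_0, dbpath)).returncode
```

Taking the copy before invoking the repair means the original files, including WiredTiger.wt and WiredTiger.turtle, remain available if the repair does not succeed.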

If this is not successful, let us know and please also provide:

  • the safeguarded copy of the WiredTiger.turtle file in addition to the WiredTiger.wt file already attached
  • the logs from the 4.0 mongod --repair attempt and subsequent restart attempt

Thank you

Comment by Keith Bostic (Inactive) [ 11/Mar/19 ]

Program Management Note

  • Status set to Needs Verification after move.