[SERVER-53628] Corrupt wt file, checksum validation levels and integrity check Created: 07/Jan/21  Updated: 22/Jun/22  Resolved: 10/Feb/21

Status: Closed
Project: Core Server
Component/s: GridFS, WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Tom Decsi Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

I am facing some issues getting MongoDB running stably again after a disk failure, and I am hoping to get some conclusive answers by posting this here.

We are running the following:

  • MongoDB 3.4.2 (GridFS) with WiredTiger engine 2.9.1
  • This is a single instance of approximately 7 TB disk size (I know this is not optimal, but due to a lack of resources we can only deploy one instance currently)
  • The database is intensively used (create, read and delete operations; no updates)
  • Collections can become quite large, so they are sharded to avoid big files
  • We do not configure the compression ourselves, so by default WiredTiger uses Snappy block compression for all collections and prefix compression for all indexes (a sketch of stating these defaults explicitly follows below)
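
For reference, a minimal mongo-shell sketch of how those defaults could be stated explicitly; the collection name, index key and settings below are illustrative only (standard GridFS names), not taken from this deployment:

    // Hedged sketch: spell out the default WiredTiger compression settings explicitly.
    // "fs.chunks" and the {files_id: 1, n: 1} key are illustrative GridFS defaults.
    db.createCollection("fs.chunks", {
        storageEngine: { wiredTiger: { configString: "block_compressor=snappy" } }
    });
    db.getCollection("fs.chunks").createIndex(
        { files_id: 1, n: 1 },
        { storageEngine: { wiredTiger: { configString: "prefix_compression=true" } } }
    );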

The following happened:

  • At some point in time we faced a disk failure, resulting in disk-level data corruption and eventually an unclean MongoDB shutdown
  • After resolving the disk I/O issue, we tried to start MongoDB again (recovering the data from the last clean checkpoint), but this failed due to read checksum errors. Example error message at startup: 

    WiredTiger error (0) [1608775397:914574][2001:0x7f2365280d40], file:store/index-22521-1925627743013728369.wt,
    WT_SESSION.checkpoint: read checksum error for 4096B block at offset 1343488: block header checksum of 4198219623 doesn't match expected checksum of 832517012

  • Analyzing the *.wt files revealed corrupt indexes (a sketch for mapping *.wt file names back to their collections and indexes follows after this list). Deleting and rebuilding them enabled us to start MongoDB again, but unfortunately it exited later when inserting or reading data. Example error message:

    WiredTiger error (0) [1608843541:169607][1978:0x7f137e0f7700], file:store/index-22520-1925627743013728369.wt, WT_CURSOR.insert: read checksum error for 12288B block at offset 446464: block header checksum of 1107817407 doesn't match expected checksum of 3384971618

  • This suggests more data files are corrupted. Running a repair action on a 7 TB database would take too long, so we decided to restore a backup from the day before (before the disk I/O issue was detected).
  • Backups are created daily by copying the underlying data files from a running MongoDB instance. Note that journaling is enabled on the same logical volume.
  • The backup files were restored successfully and MongoDB was running fine again.
  • Unfortunately, after 3 days, MongoDB exited again due to a checksum error. This suggests we restored corrupted data from our backup. Example error message: 

    2021-01-02T04:28:59.249+0800 E STORAGE [conn45] WiredTiger error (0) [1609532939:249830][1952:0x7fe688738700], file:store/index-14545-1925627743013728369.wt, WT_CURSOR.search_near: read checksum error for 12288B block at offset 2908160: block header checksum of 0 doesn't match expected checksum of 4186336817

  • We were able to resolve this particular corruption (by deleting it), and MongoDB is running again
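
For context on the analysis step mentioned in the list above, here is a minimal mongo-shell sketch of mapping an on-disk *.wt file name (such as index-22521-...) back to its collection or index. It relies on the uri field reported by stats(), whose table name matches the .wt file's base name; treat it as a rough sketch rather than an official tool:

    // Hedged sketch: print which collection/index each WiredTiger table (.wt file)
    // of the current database belongs to. The "uri" values look like
    // "statistics:table:index-22521-..."; the table name matches the .wt file name.
    db.getCollectionNames().forEach(function (collName) {
        try {
            var stats = db.getCollection(collName).stats({ indexDetails: true });
            print(collName + " -> " + stats.wiredTiger.uri);
            Object.keys(stats.indexDetails || {}).forEach(function (idxName) {
                print("    index " + idxName + " -> " + stats.indexDetails[idxName].uri);
            });
        } catch (e) {
            print(collName + " -> stats failed: " + e);
        }
    });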

Of course, this is not acceptable; it might be a ticking time bomb, as other data blocks might be corrupted and MongoDB would exit whenever they are accessed.

So a couple of questions I hope to find an answer to:

  • To my understanding, WiredTiger does some initial integrity checks at MongoDB startup. Which checks are those? I suspect collection and index checksums only; is this correct? If not, which integrity checks are performed at startup?
  • A block consists of a page header structure, a block header structure and a chunk of data. Snappy compression is the default, meaning that the checksum is not stored in the page header and checksum validation is taken care of by Snappy. Is this statement correct?
  • Is there a way we can do block-level integrity checks (i.e. via Snappy) to ensure data integrity for the whole database? I understand this would be an intrusive operation, but we could do it in a phased approach (e.g. per collection, shard, ...), as sketched after this list. Is there any useful documentation or are there references available concerning these kinds of issues? I can imagine we are not the first ones suffering from this.
  • I would be extremely grateful for any other suggestions on how to perform a complete data integrity check on a running MongoDB instance. Repair actions would be secondary; the primary goal is to identify possible integrity issues, if any. 
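
Below is a rough mongo-shell sketch of such a phased pass, validating one collection at a time with db.collection.validate; the database name is assumed from the file paths in the logs above, and note that a full validation reads every document and index entry and blocks access to the collection while it runs:

    // Hedged sketch of a phased integrity check: validate one collection at a time.
    // validate(true) requests a full validation; phase it per collection (e.g. during
    // quiet periods) to limit the impact window.
    var d = db.getSiblingDB("store");   // database name assumed from the log paths above
    d.getCollectionNames().forEach(function (collName) {
        var res = d.getCollection(collName).validate(true);
        if (res.valid) {
            print("ok:      " + collName);
        } else {
            print("CORRUPT: " + collName + " " + tojson(res.errors));
        }
    });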


 Comments   
Comment by Tom Decsi [ 04/Feb/22 ]

Hi Brajmohan,

You can run the db.collection.validate script to identify the corrupt indexes. We eventually dropped all corrupt collections, as we could not afford the downtime or performance impact. The data loss (in our case, of older data) was accepted. Mongo has been running fine afterwards ... Probably not the solution you were hoping for ...
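
For what it is worth, a minimal sketch of that approach (the database name is assumed and the final drop is deliberately left commented out; review the list and confirm the data loss is acceptable first):

    // Hedged sketch: collect collections that fail validation, review the list,
    // and only then drop them, accepting the data loss.
    var d = db.getSiblingDB("store");   // replace with your database name
    var corrupt = d.getCollectionNames().filter(function (collName) {
        return !d.getCollection(collName).validate(true).valid;
    });
    printjson(corrupt);
    // After reviewing and accepting the loss:
    // corrupt.forEach(function (collName) { d.getCollection(collName).drop(); });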

Rgds

Tom

Comment by Brajmohan Sharma [ 04/Feb/22 ]

Hi Tom Decsi,

We are facing the same issue. How did you analyze the *.wt files to identify the names of the corrupt indexes? How did you bring the instance up in order to delete and rebuild them? We are unable to start the MongoDB service.

Many Thanks

Braj Mohan

Comment by Edwin Zhou [ 10/Feb/21 ]

Hi tom.decsi@itinq.com,

Unfortunately we have no implementation available to run --repair on only select collections.

To avoid a problem like this in the future, it is our strong recommendation to:

Best regards,
Edwin

Comment by Tom Decsi [ 10/Feb/21 ]

Hi Edwin,

Thanks for your reply. Yes, we are able to run the db.collection.validate script, and several collections were identified as corrupted. Please note that this is not preventing Mongo from starting up; Mongo starts fine.

But whenever data is accessed in one of those collections, Mongo will exit due to a WiredTiger panic error, as we have seen before.

{{mongod --repair}} is an option, but this would take a whole week according to our estimates. Do you know if we can execute the repair on only the corrupted collections instead of the whole database? Or are there any other options we may consider (besides just dropping the corrupted ones)?

Comment by Edwin Zhou [ 08/Feb/21 ]

Hi tom.decsi@itinq.com,

We'd love to hear back from you about your disk corruption!

Were you able to try running db.collection.validate on the affected collections? After validating collections, I recommend trying mongod --repair. This may remove some documents, but it should eliminate any corruption that prevents mongod from starting up.

Thanks,
Edwin

Comment by Edwin Zhou [ 21/Jan/21 ]

Hi tom.decsi@itinq.com,

MongoDB 3.4 reached end of life in January 2020, but we can provide limited guidance on this issue. As you've identified, this appears to be disk corruption.

First, make a complete copy of the database's $dbpath directory as a safeguard, so that you can work off of the current $dbpath.

The best way to look for corruption is to run db.collection.validate on the affected collections. Index corruption can be solved by reindexing, which you've mentioned you've done in your steps. After validating collections, I recommend trying mongod --repair. This may remove some documents, but it should eliminate any corruption that prevents mongod from starting up.

Best,

Edwin
