[SERVER-33188] WiredTiger.wt, connection: read checksum error for 4096B block Created: 08/Feb/18 Updated: 06/May/18 Resolved: 09/Feb/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.2.11 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Paule Lecuyer | Assignee: | Mark Agarunov |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Debian Stretch |
||
| Attachments: |
|
| Participants: |
| Description |
|
Same problem as Our system went down due to an electrical blackout. Since then I cannot restart mongo; I get the error "WiredTiger.wt, connection: read checksum error for 4096B block" when I try to start it. mongod --repair fails with the same error. Is there any tool capable of fixing this? Can you help me repair this file? Thanks |
| Comments |
| Comment by Ramon Fernandez Marina [ 06/May/18 ] |
|
Apologies for the radio silence plecuyer, we let this one fall through the cracks. Given the corruption you experienced, unfortunately there was no way to do any further recovery of the data. Regards, |
| Comment by Paule Lecuyer [ 10/Feb/18 ] |
|
Hello Mark, We'll apply your recommendations in the future, but at the present time our backups are unfortunately too old, and we haven't configured replication for it... Thanks, |
| Comment by Mark Agarunov [ 09/Feb/18 ] |
|
Hello plecuyer, Unfortunately, this error indicates that there was corruption on the disk, most often caused by a faulty storage layer. In this situation, our best recommendation would be to resync the affected node or restore from a backup if possible. To prevent this type of problem in the future, please take note of the following guidelines to help mitigate any issues related to unreliable storage layers or server failures.
Thanks, |
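For readers unfamiliar with what a "read checksum error for 4096B block" means in general terms, the idea can be sketched as follows. This is only an illustration of per-block checksumming under stated assumptions: it is not WiredTiger's actual on-disk format or checksum algorithm, and the names `write_blocks` and `find_corrupt_blocks` are hypothetical helpers invented for this sketch.

```python
import zlib

BLOCK_SIZE = 4096  # same block size as in the error message; otherwise arbitrary


def write_blocks(payload: bytes) -> list[tuple[int, bytes]]:
    """Split payload into fixed-size blocks and store a CRC32 alongside each one."""
    blocks = []
    for offset in range(0, len(payload), BLOCK_SIZE):
        # Pad the final block with zero bytes so every block is BLOCK_SIZE long.
        block = payload[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        blocks.append((zlib.crc32(block), block))
    return blocks


def find_corrupt_blocks(blocks: list[tuple[int, bytes]]) -> list[int]:
    """Return indexes of blocks whose recomputed checksum differs from the stored one."""
    return [
        index
        for index, (stored, block) in enumerate(blocks)
        if zlib.crc32(block) != stored
    ]


# 13,000 bytes of sample data, which occupies four 4096-byte blocks.
blocks = write_blocks(b"example data " * 1000)

# Simulate on-disk damage: flip one byte in the second block, keep the old checksum.
checksum, block = blocks[1]
damaged = bytearray(block)
damaged[10] ^= 0xFF
blocks[1] = (checksum, bytes(damaged))

corrupt = find_corrupt_blocks(blocks)
print(corrupt)
```

A checksum mismatch of this kind tells the reader that a block changed after it was written, but not what the original contents were, which is why the recommendation above is to resync or restore from a backup rather than to repair the block in place.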
| Comment by Paule Lecuyer [ 09/Feb/18 ] |
|
Thanks for your quick answer, Mark. I did as you said, but got another error. It seems that the sizeStorer.wt file is also corrupted. Have a nice day. |
| Comment by Mark Agarunov [ 08/Feb/18 ] |
|
Hello plecuyer, Thank you for your report. I've attached a repair attempt of the files you provided. Please extract these files and replace them in your $dbpath and let us know if it resolves the issue. If you are still seeing errors after replacing these files, please provide the complete logs from the affected node(s) so that we can further investigate. Using systemd to stop mongod should perform a clean shutdown. Thanks, |
| Comment by Paule Lecuyer [ 08/Feb/18 ] |
|
After further investigation, it seems that the mongodb corruption did not happen after the electrical blackout, but after a "normal" system reboot. Just before rebooting, mongodb was consuming a lot of CPU. Does stopping the "mongodb.service" unit kill the mongo process uncleanly, causing such corruption? Paule. |
| Comment by Paule Lecuyer [ 08/Feb/18 ] |
|
I have attached a list of all files in the mongodb data dir, along with the WiredTiger* files