[SERVER-27224] file:WiredTiger.wt read checksum error, mongodb won't start Created: 30/Nov/16 Updated: 13/Aug/18 Resolved: 14/Mar/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.2.9 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | rudolp esquilon | Assignee: | David Hows |
| Resolution: | Done | Votes: | 0 |
| Labels: | envm, rfi, rpu, trcf, wtc | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | Linux |
| Participants: |
| Description |
|
Hi, I got the same issue with this one: thank you! |
| Comments |
| Comment by rudolp esquilon [ 14/Mar/17 ] |
|
I see. Anyway, thank you. |
| Comment by David Hows [ 14/Mar/17 ] |
|
Hi Rudolph, Sorry for the long delay in getting back to you. I tried to get a repair of your data set going and got to the same point you did. I then made some minor journal modifications to work around the issue you were seeing. From there, I started looking at the integrity of all your files. None of the database files could be validated or read; all were corrupted. I also manually examined one of the internal catalog tables (_mdb_catalog.wt), which has a fixed BSON String format for data (unlike the data-containing collections, which have no fixed schema). In this file I found strings from JSON documents and other things that looked highly out of place. Given this, I believe that something external to MongoDB (likely the filesystem) has caused corruption in your database files. |
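A rough way to look for this kind of out-of-place text in a binary file is to list its printable runs and flag the ones that resemble textual JSON. The following is only a sketch of that idea, not the procedure used in this ticket: the function names, the minimum run length, and the quoted-key heuristic are all assumptions chosen for illustration, and a legitimate string value could also trigger the heuristic.

```python
import re


def printable_runs(data, min_len=8):
    """Return runs of at least min_len printable ASCII bytes (like the Unix `strings` tool)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def looks_like_json_text(run):
    """Heuristic (assumption): textual JSON contains a quoted key followed by a colon."""
    return re.search(r'"\w+"\s*:', run) is not None


def suspicious_runs(path, min_len=8):
    """Read a file in binary mode and return printable runs that resemble JSON text."""
    with open(path, "rb") as fh:
        data = fh.read()
    return [run for run in printable_runs(data, min_len) if looks_like_json_text(run)]
```

Calling `suspicious_runs` on a copy of a data file (never the live dbpath) gives a list to review by eye; an empty list does not prove the file is intact.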
| Comment by Kelsey Schubert [ 20/Dec/16 ] |
|
Hi rud0lp20, Thanks for uploading the files. Unfortunately, this type of post-mortem requires a significant effort to analyze the files. We'll update this ticket after we conclude our investigation. Kind regards, |
| Comment by rudolp esquilon [ 15/Dec/16 ] |
|
Hi, is there any news? |
| Comment by rudolp esquilon [ 08/Dec/16 ] |
|
Hi, I've already uploaded all required files... |
| Comment by rudolp esquilon [ 07/Dec/16 ] |
|
OK, thank you. However, we don't have backup files yet. |
| Comment by Kelsey Schubert [ 06/Dec/16 ] |
|
Hi rud0lp20, Unfortunately, since the repair attempt was unsuccessful, my advice would be to perform an initial sync or restore from a backup. Before doing so, would you be able to provide the complete $dbpath for us to investigate this issue? I've created a secure upload portal where you can provide logs following the repair attempt and data files. Files uploaded to this portal are only visible to MongoDB employees investigating this issue and are routinely deleted after some time. Thank you for your help, |
| Comment by rudolp esquilon [ 06/Dec/16 ] |
|
Up... any news? |
| Comment by rudolp esquilon [ 01/Dec/16 ] |
|
Hi, 1. Actually, we rebooted the server and didn't manually stop it. Thanks |
| Comment by Kelsey Schubert [ 01/Dec/16 ] |
|
Hi rud0lp20, Thank you for the answers. I have a few follow-up questions to help us in our investigation.
Thanks again, |
| Comment by rudolp esquilon [ 01/Dec/16 ] |
|
hi Thomas, here is my answer |
| Comment by Kelsey Schubert [ 30/Nov/16 ] |
|
Hi rud0lp20, I've attempted a repair of the uploaded files. Please extract them and replace them in your dbpath. I have a few questions to get a better understanding of what happened here, but please understand that in cases like this we may not be able to identify the root cause from the information you provide.
Thank you, |
| Comment by rudolp esquilon [ 30/Nov/16 ] |
|
FYI
[1480529755:805623][22119:0x7f5b50b16740], file:WiredTiger.wt, WT_CURSOR.next: read checksum error for 24576B block at offset 57344: calculated block checksum of 4001194994 doesn't match expected checksum of 4050905503
2016-11-30T10:55:50.367-0500 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=18G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0), |
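When collecting several such errors, it can help to pull out the file, block size, offset, and the two checksums from the message. The following is a minimal sketch, assuming the message layout matches the single line quoted in this ticket; the regular expression and the name `parse_checksum_error` are illustrative and are not part of MongoDB or WiredTiger.

```python
import re

# Pattern modelled on the one "read checksum error" line quoted in this ticket.
CHECKSUM_ERROR = re.compile(
    r"(?P<uri>file:\S+?), \S+: read checksum error for (?P<size>\d+)B block "
    r"at offset (?P<offset>\d+): calculated block checksum of (?P<calculated>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return the fields of a checksum error message as a dict, or None if absent."""
    match = CHECKSUM_ERROR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "uri": fields["uri"],
        "size": int(fields["size"]),
        "offset": int(fields["offset"]),
        "calculated": int(fields["calculated"]),
        "expected": int(fields["expected"]),
    }


SAMPLE = (
    "[1480529755:805623][22119:0x7f5b50b16740], file:WiredTiger.wt, "
    "WT_CURSOR.next: read checksum error for 24576B block at offset 57344: "
    "calculated block checksum of 4001194994 doesn't match expected "
    "checksum of 4050905503"
)

if __name__ == "__main__":
    print(parse_checksum_error(SAMPLE))
```

The parsed offset and size identify which block of the file failed verification, which is useful context to include when uploading files for investigation.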