[SERVER-28666] WiredTiger error - file:WiredTiger.wt, WT_CURSOR.next: read checksum error
| Created: 07/Apr/17 | Updated: 13/Aug/18 | Resolved: 07/Apr/17 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.4.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Marc Henri [X] | Assignee: | Kelsey Schubert |
| Resolution: | Done | Votes: | 0 |
| Labels: | envns, rpo, rpu, trcf, wtc |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Operating System: | Linux |
| Steps To Reproduce: | /etc/init.d/mongod_tw start |
| Participants: |
| Description |
|
After a power failure, two MongoDB servers won't start again. On the first one I get:

WiredTiger error (0) [1491533102:978650][31243:0x70b9da9a2dc0], file:WiredTiger.wt, WT_CURSOR.next: read checksum error for 28672B block at offset 9068544: block header checksum of 0 doesn't match expected checksum of 4059665721

And on the other:

WiredTiger error (-31802) [1491534981:938109][32667:0x65917b0f2dc0], file:collection-30-4121866540730348039.wt, WT_SESSION.open_cursor: /data2/instagram_bak/collection-30-4121866540730348039.wt: handle-read: pread: failed to read 4094 bytes at offset 2: WT_ERROR: non-specific WiredTiger error

Would you be willing to try repairing the .wt files for both servers? I've attached them as two separate sets of files. Could you also explain the methods used for the repair attempt?

Thanks |
| Comments |
| Comment by Kelsey Schubert [ 07/Apr/17 ] |
|
Hi Cezam,

Unfortunately, this indicates that there was additional corruption on disk following the power failure. In this situation, my best recommendation is to resync the affected nodes or restore from a backup.

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support, see our Technical Support page for additional resources.

Kind regards, |
| Comment by Marc Henri [X] [ 07/Apr/17 ] |
|
Now set2 failed as well after going through 80% of the db. Here is the error trace
| Comment by Marc Henri [X] [ 07/Apr/17 ] |
|
Hi again, set1 ended up failing as well, although the repair ran much longer than it did before I sent you the files.
| Comment by Marc Henri [X] [ 07/Apr/17 ] |
|
Hey Thomas, I've launched a repair on both databases using your files and am now awaiting the result. Thank you, Marc |
| Comment by Kelsey Schubert [ 07/Apr/17 ] |
|
Hi Cezam,

I've attached a tarball with repair attempts for both sets. Please extract the files and replace the originals in their respective paths, and let us know if that resolves the issue.

Unfortunately, the process we use to attempt these repairs is not ready to be publicly shared. We're tracking the work to make repair and recovery of the WiredTiger storage engine more robust in

Thank you, |