[SERVER-26533] WiredTiger library panic Created: 08/Oct/16 Updated: 13/Aug/18 Resolved: 13/Oct/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.2.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Shuo Wang | Assignee: | Kelsey Schubert |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | envm, rge, wtc | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | Linux |
| Participants: |
| Description |
|
A MongoDB instance in a replica set shut down unexpectedly. I had to remove all of its data and restart it in order to recover service as soon as possible. The log information is as follows:
BTW: The version is v3.2.1. |
| Comments |
| Comment by Kelsey Schubert [ 23/Jan/17 ] | |||
|
Hi shuowang, Unfortunately, as Keith explained, there is no simple process to get the data from a particular block. Kind regards, | |||
| Comment by Keith Bostic (Inactive) [ 16/Jan/17 ] | |||
I don't think there's any simple solution (although anonymous.user may have a better idea than I do). In short: first, documents are encoded as BSON, so there may or may not be useful text there; second, documents are generally compressed using snappy; and finally, if the block is truly corrupted, it may not be possible to decompress it. If the backing collection isn't compressed, I would suggest using a tool like hexdump to display the contents of the block. If the backing collection is compressed, it's going to be harder to get the information. If you want to use dd to copy out that 8192B block from the file and upload it into this ticket, I could try and crack it and let you know if I see anything useful. | |||
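The dd and hexdump steps suggested above can also be sketched in Python. This is a minimal sketch, not an official recovery tool: the function names `read_block` and `hexdump` are hypothetical, the offset and size are the ones quoted in the request below, and the data file path must be supplied by the reader. It only reads the file, and it is safest to run it against a copy of the data file while mongod is stopped. If the collection is compressed, the dump will not be human-readable.

```python
def read_block(path, offset, size):
    """Return exactly `size` bytes starting at byte `offset` of `path`."""
    with open(path, "rb") as f:          # read-only; the file is never modified
        f.seek(offset)
        data = f.read(size)
    if len(data) != size:
        raise ValueError(f"short read: wanted {size} bytes, got {len(data)}")
    return data


def hexdump(data, base=0, width=16):
    """Format `data` as offset, hex bytes, and printable ASCII, one row per `width` bytes."""
    pad = width * 3 - 1                  # width of the hex column, for alignment
    rows = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{base + i:012x}  {hex_part:<{pad}}  |{text}|")
    return "\n".join(rows)
```

Usage, with a hypothetical path to a copy of the affected file: `block = read_block("/path/to/copy/collection.wt", 1312190464, 8192)`, then `print(hexdump(block, base=1312190464))`, or write `block` to a separate file to attach to the ticket.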
| Comment by Shuo Wang [ 16/Jan/17 ] | |||
|
Hi, is there any way to get the data from the 8192B block at offset 1312190464? I want to know which document is corrupted in the MongoDB collection.
Thanks | |||
| Comment by Alexander Gorrod [ 16/Oct/16 ] | |||
|
shuowang The error leading to MongoDB shutting down was:
That error means WiredTiger (the storage engine for MongoDB) detected that there was a disk corruption affecting the database. Is it possible that your underlying storage system is unreliable? I recommend running integrity checks on your hardware to look for problems. When MongoDB detects an on-disk data corruption, it exits immediately, since attempting to continue could lead to more extensive data corruption and/or data loss. The behavior you report is expected in this situation. | |||
| Comment by Shuo Wang [ 16/Oct/16 ] | |||
|
Another MongoDB instance crashed yesterday. Unlike the previous situation, the crashed instance could be restarted normally this time. I've archived the diagnostic.data and mongodb.log. Please see the attachment. Thanks. | |||
| Comment by Kelsey Schubert [ 13/Oct/16 ] | |||
|
Hi shuowang, Since there isn't anything more we can do to progress our investigation at this time, I am going to close this ticket. Please let us know if you encounter this issue again, and we will reopen this ticket and continue to investigate. Kind regards, | |||
| Comment by Shuo Wang [ 12/Oct/16 ] | |||
|
Thanks. It's a very helpful suggestion. | |||
| Comment by Kelsey Schubert [ 12/Oct/16 ] | |||
|
Hi shuowang, Unfortunately, from the information currently available, we cannot identify the root cause of this issue. Would you please upgrade to MongoDB 3.2.10 (the latest release) and make sure you’re configured to track error messages on failure and capture diagnostic information? Please also confirm that your filesystem isn’t running out of space and otherwise meets the requirements of MongoDB. If this error happens again, we will need an archive of the diagnostic.data, which would cover the period leading up to the failure, and confirmation that no system errors occurred. In addition, I would recommend copying the $dbpath before resyncing if possible, as an examination of the data files may help progress the investigation. Thank you, | |||
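One quick way to confirm that the filesystem holding the $dbpath is not running out of space is sketched below. This is a minimal sketch using Python's standard `shutil.disk_usage`; the function name `free_space` and the 10% threshold are assumptions chosen for illustration, not a MongoDB requirement.

```python
import shutil


def free_space(path, min_free_fraction=0.10):
    """Return (free_bytes, free_fraction, ok) for the filesystem containing `path`.

    `min_free_fraction` is an arbitrary illustrative threshold, not a MongoDB rule.
    """
    usage = shutil.disk_usage(path)      # named tuple: total, used, free (bytes)
    free_fraction = usage.free / usage.total
    return usage.free, free_fraction, free_fraction >= min_free_fraction
```

Usage, with a hypothetical dbpath: `free, fraction, ok = free_space("/var/lib/mongodb")`. Running this periodically, or right after a failure, gives a record to include alongside the diagnostic.data archive.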
| Comment by Shuo Wang [ 09/Oct/16 ] | |||
|
"A mongodb instance was unexpectedly closed" means that the process suddenly crashed by itself. After that, I tried to start it, but it didn't work. | |||
| Comment by Keith Bostic (Inactive) [ 08/Oct/16 ] | |||
|
Thank you for the report. We are investigating this issue; however, please be aware that determining the root cause of data corruption without a reproduction is challenging. Can you describe further what you mean by "A mongodb instance was unexpectedly closed"? Can you please tell us if there are any errors recorded in the system logs, either before or around the time of the failure? What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using? | |||
| Comment by Shuo Wang [ 08/Oct/16 ] | |||
|
Another MongoDB instance shut down unexpectedly a few days ago. The log information is as follows: 2016-10-04T02:44:34.634+0800 E STORAGE [repl writer worker 16] WiredTiger (0) [1475520274:634770][8619:0x7f3d4b742700], file:collection-310--1292709067801170191.wt, WT_CURSOR.search: read checksum error for 12288B block at offset 139472896: calculated block checksum of 2825332407 doesn't match expected checksum of 2211965436 | |||
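When several instances report this error, it can help to pull the file name, block size, offset, and checksums out of each log line so they can be compared. The following is a minimal sketch whose pattern is modeled only on the single log line quoted above; other MongoDB or WiredTiger versions may word the message differently, and the names `CHECKSUM_ERROR` and `parse_checksum_error` are hypothetical.

```python
import re

# Pattern derived from the one log line in this ticket; treat it as an assumption.
CHECKSUM_ERROR = re.compile(
    r"file:(?P<file>[^,\s]+), .*?"
    r"read checksum error for (?P<size>\d+)B block at offset (?P<offset>\d+): "
    r"calculated block checksum of (?P<calculated>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return the fields of a checksum-error log line as a dict, or None if no match."""
    match = CHECKSUM_ERROR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "file": fields["file"],
        "size": int(fields["size"]),              # block size in bytes
        "offset": int(fields["offset"]),          # byte offset within the file
        "calculated": int(fields["calculated"]),  # checksum computed on read
        "expected": int(fields["expected"]),      # checksum stored with the block
    }
```

The returned `offset` and `size` are the values one would pass to dd (or an equivalent read) to copy the affected block out of the named file.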
| Comment by Shuo Wang [ 08/Oct/16 ] | |||
|
How can I modify the description? It looks weird, with unrecognized characters. |