[SERVER-31097] Two shards in cluster getting WT LIBRARY PANIC creating a simple index and every index retry crashes again Created: 14/Sep/17 Updated: 24/Sep/17 Resolved: 14/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | 3.4.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Lucas | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
When creating a background index in our cluster with 7 shards (600 million documents), on a collection sharded by a hashed index, the server continuously crashes. We created this index:
After building for some time, MongoDB crashed with this error:
I will attach two log files. The first one is from the first crash (right after the index build started) and the second one is from a subsequent crash. When I started the server with the option --noIndexBuildRetry, the crashes stopped. I will run an initial sync on those two servers because I'm not confident that this did not corrupt any data or indexes in my database. |
| Comments |
| Comment by Lucas [ 24/Sep/17 ] |
|
Hello pasette, thanks for your comment. Unfortunately I no longer have access to these shards' files because they have been replaced with new ones and an initial sync has been performed. But I can tell you two things: 1. I already got this problem before ( 2. Another replica set (without ANY connection to these shards) crashed today with the WT library panic error. I know it can be a different problem, but it's really strange. I will attach log files and metrics files when this happens, along with all WiredTiger files, and I really wish this were diagnosed. It is very worrying that a database has so many chances of corrupting itself. Thanks. |
| Comment by Daniel Pasette (Inactive) [ 15/Sep/17 ] |
|
Hi Lucas, |
| Comment by Lucas [ 15/Sep/17 ] |
|
Did you not even find it strange that, as I said in the description, TWO different shards crashed while trying to create the index? Two different dedicated servers crashing at the same time while we were creating the index? Those MongoDB servers have been up for more than one month, and I already indexed this same collection weeks ago. And what do you mean by checking the integrity of the storage layer? Those shards are too new to have corrupted storage like this, without any interruption or anything of that sort. How can this data corruption happen? And I know about the SERVER project, but unfortunately I can't agree that this isn't weird or something like a bug. Thanks. |
| Comment by Ramon Fernandez Marina [ 14/Sep/17 ] |
|
These error messages indicate that the data on disk is corrupt. Even if you were able to upload the data, I don't think we would be able to reconstruct it, so I'd recommend the following:
Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources. Regards, |