[SERVER-34858] WiredTiger crash - checksum failed Created: 05/May/18 Updated: 06/Dec/22 Resolved: 09/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.6.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Florian Krebs | Assignee: | Backlog - Triage Team |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Host system: Ubuntu 16.04.4 with 32GB RAM (but MongoDB runs inside a Docker container) |
||
| Attachments: |
|
| Assigned Teams: |
Server Triage
|
| Operating System: | ALL |
| Participants: |
| Description |
|
Hey, we are using MongoDB in a machine learning project written in Python. After the Python script starts, everything runs perfectly at first, until at some point MongoDB crashes. The point at which this happens seems random, but it has now happened three times in a row.
We run MongoDB inside a Docker container, currently using the image mongo:latest. The server version is 3.6.4.
The message says that the calculated block checksum doesn't match the expected checksum. I will attach the log to this issue. Unfortunately, we don't know why or when this error happens. The script is basically a loop that iterates over the data, feeds it into a machine learning model, and writes the results back into MongoDB. The same code is executed in every iteration, but at some random iteration MongoDB crashes.
I hope you can help, because we really don't know how to find and fix this issue. |
| Comments |
| Comment by Florian Krebs [ 09/May/18 ] | |
|
Okay, thank you. I will first try it with volumes and, if that doesn't work, open a thread in the group. | |
| Comment by Ramon Fernandez Marina [ 09/May/18 ] | |
|
I tried to reproduce this on macOS and Linux with no success. I'm no Docker expert, but (assuming the host's storage layer is healthy) you may want to try with a volume instead of a bind mount first. Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources. Regards, | |
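The volume-versus-bind-mount suggestion above comes down to what is passed to Docker's `-v` option: a bare name selects a Docker-managed named volume, while an absolute host path selects a bind mount. The following is a minimal Python sketch that only builds the two command lines for comparison. The volume name `mongo-data`, the host path `/srv/mongo-data`, and the container name `mongo-ml` are hypothetical; `/data/db` is the data directory used by the official `mongo` image, and `mongo:latest` is the image named in the report.

```python
import shlex


def mongo_docker_command(storage, name="mongo-ml", image="mongo:latest"):
    """Build a `docker run` argument list for a MongoDB container.

    `storage` is the left-hand side of Docker's `-v` option:
    a bare name (e.g. "mongo-data") is treated by Docker as a named volume,
    an absolute path (e.g. "/srv/mongo-data") as a bind mount.
    All concrete names used here are hypothetical examples.
    """
    return [
        "docker", "run", "-d",
        "--name", name,
        "-v", f"{storage}:/data/db",  # /data/db is the official image's dbpath
        image,
    ]


# Named volume (the variant suggested for a first attempt).
named_volume_cmd = mongo_docker_command("mongo-data")
# Bind mount of a host directory (the kind of setup being replaced).
bind_mount_cmd = mongo_docker_command("/srv/mongo-data")

print(shlex.join(named_volume_cmd))
print(shlex.join(bind_mount_cmd))
```

Data in a named volume is kept when the container is removed, unless the volume itself is deleted explicitly. The sketch does not run Docker; the printed commands would have to be executed separately.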
| Comment by Florian Krebs [ 09/May/18 ] | |
|
It seems that it isn't possible for me to install MongoDB on the host machine directly. So is there anything I can do to prevent that issue from happening when running MongoDB in a Docker container? | |
| Comment by Florian Krebs [ 07/May/18 ] | |
|
Hey @Ramon Fernandez, unfortunately I cannot run MongoDB outside of Docker on this server due to company policy. I am currently storing the data in a shared volume on an ext4 partition. After reading the note you linked, though, I am not sure whether shared volumes work with MongoDB. Do you have a recommended way of storing data outside of the Mongo container? We want to store the data outside of the container so that it isn't lost when cleanup removes stopped containers.
Edit2: If running MongoDB on the host machine is the only way of finding a solution I will talk to the server administrator and ask him if he can temporarily install MongoDB for us. | |
| Comment by Ramon Fernandez Marina [ 07/May/18 ] | |
|
florian.krebs, the errors you're seeing seem to indicate that the storage layer is not providing enough durability guarantees; this typically happens when fsync() doesn't work as expected, which happens in shared mounts. Are you able to reproduce these issues outside of Docker? Regards, |
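To illustrate the mechanism described above, here is a minimal Python sketch of a block that is written together with its checksum, flushed with `fsync()`, and verified on read. It is not WiredTiger's implementation: the block layout and the helper names `write_block_durably` and `verify_block` are hypothetical, and `zlib.crc32` is used only as a stand-in for the storage engine's own checksum routine. A read-back on a healthy machine also cannot prove that the storage layer honours `fsync()` across a crash; the sketch only shows how a mismatch between stored and recomputed checksums is detected.

```python
import os
import tempfile
import zlib


def write_block_durably(path, payload):
    """Write payload followed by its 4-byte CRC32, then fsync the file."""
    checksum = zlib.crc32(payload)
    with open(path, "wb") as f:
        f.write(payload)
        f.write(checksum.to_bytes(4, "big"))
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push the data to stable storage
    return checksum


def verify_block(path):
    """Recompute the CRC32 of the payload and compare it with the stored one."""
    with open(path, "rb") as f:
        data = f.read()
    payload, stored = data[:-4], int.from_bytes(data[-4:], "big")
    return zlib.crc32(payload) == stored


with tempfile.TemporaryDirectory() as tmp:
    block_path = os.path.join(tmp, "block.bin")
    write_block_durably(block_path, b"model results for one iteration")
    print("intact block verifies:", verify_block(block_path))

    # Simulate a block that reached disk damaged: flip one payload byte.
    with open(block_path, "r+b") as f:
        first = f.read(1)
        f.seek(0)
        f.write(bytes([first[0] ^ 0xFF]))
    print("damaged block verifies:", verify_block(block_path))
```

If the storage layer acknowledges `fsync()` without actually persisting the data, a later read can return stale or partial content, and a check of this kind is what surfaces it as a checksum failure.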