[SERVER-34858] WiredTiger crash - checksum failed
Created: 05/May/18  Updated: 06/Dec/22  Resolved: 09/May/18

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.6.4
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Florian Krebs
Assignee: Backlog - Triage Team
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Host system: Ubuntu 16.04.4 with 32GB RAM (But MongoDB runs inside Docker-image)


Attachments: Text File MongoDB-error.log    
Assigned Teams:
Server Triage
Operating System: ALL
Participants:

 Description   

Hey,

we are using MongoDB in a machine learning project written in Python. After the Python script starts, everything runs fine until, at some point, MongoDB crashes. The point at which this happens seems pretty random, but it has now happened three times in a row.

 

We are running MongoDB inside a Docker container and are currently using the image mongo:latest. The server version is 3.6.4.

 

The message says that the calculated block checksum doesn't match the expected checksum. I will attach the log to this issue. Unfortunately, we don't know why or when this error happens. The script is basically a loop that iterates over the data, feeds it into a machine learning model, and writes the results back into MongoDB. The same code is executed in every iteration, but at some random iteration MongoDB crashes.

 

I hope you can help me, because we really don't know how to find and fix this issue.



 Comments   
Comment by Florian Krebs [ 09/May/18 ]

Okay,

thank you. I will try volumes first and, if that doesn't work, open a thread in the group.

Comment by Ramon Fernandez Marina [ 09/May/18 ]

I tried to reproduce this on macOS and Linux with no success. I'm no Docker expert, but (assuming the host's storage layer is healthy) you may want to try with a volume instead of a bind mount first.
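A minimal sketch of what switching from a bind mount to a named volume could look like, based on the `docker run` command quoted further down in this thread. The volume and container names are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: start mongod with a Docker-managed named volume instead of a bind mount.
# VOLUME_NAME and CONTAINER_NAME are hypothetical placeholders, not from the report.
VOLUME_NAME="${VOLUME_NAME:-mongo-data}"
CONTAINER_NAME="${CONTAINER_NAME:-mongo-ml}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the named volume; Docker keeps it in its own data directory,
# so it survives removal of stopped containers.
run docker volume create "$VOLUME_NAME"

# Same options as the bind-mount command, but -v now names the volume
# instead of a host directory.
run docker run --name "$CONTAINER_NAME" -d -v "$VOLUME_NAME:/data/db" -p 27017:27017 mongo:latest
```

A named volume is not removed when its container is removed, so the data stays available to a new container started with the same `-v` argument. Whether this avoids the checksum failures depends on the host's storage layer being healthy.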

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources.

Regards,
Ramón.

Comment by Florian Krebs [ 09/May/18 ]

It seems that it isn't possible for me to install MongoDB directly on the host machine. So is there anything I can do to prevent this issue from happening when running MongoDB in a Docker container?

Comment by Florian Krebs [ 07/May/18 ]

Hey @Ramon Fernandez,

unfortunately I cannot run MongoDB outside of Docker on this server due to company policy. I am currently storing the data in a shared volume on an ext4 partition. After reading the note you linked, though, I am not sure whether shared volumes work with MongoDB. Do you have a recommended way of storing data outside of the Mongo container? We want to store the data outside of the container so it isn't lost when cleanup removes stopped containers.

 
Edit:
The current command that I use is: 

docker run --name <name> -d -v <host directory>:/data/db -p 27017:27017 mongo:latest

Edit2: If running MongoDB on the host machine is the only way of finding a solution I will talk to the server administrator and ask him if he can temporarily install MongoDB for us.

Comment by Ramon Fernandez Marina [ 07/May/18 ]

florian.krebs, the errors you're seeing seem to indicate that the storage layer is not providing enough durability guarantees; this typically happens when fsync() doesn't work as expected, which can be the case with shared mounts.
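As a first check, it may help to see which filesystem actually backs the data directory on the host. The sketch below assumes GNU coreutils `stat`; the helper name `fs_type` and the default path are placeholders. The filesystem type alone does not prove that fsync() behaves correctly; it only shows whether the directory sits on a local, network, or overlay filesystem.

```shell
#!/bin/sh
# Sketch: report the filesystem type behind a directory (assumes GNU coreutils stat).
# Pass the host directory that is mounted at /data/db; defaults to the current directory.
DATA_DIR="${1:-.}"

fs_type() {
    # -f queries the filesystem containing the path; %T prints its type name.
    stat -f -c %T "$1"
}

echo "$DATA_DIR is on filesystem type: $(fs_type "$DATA_DIR")"
```

To see how the directory is mounted into the container, `docker inspect --format '{{ json .Mounts }}' <name>` lists the container's mounts and their types.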

Are you able to reproduce these issues outside of Docker?

Regards,
Ramón.
