[SERVER-22619] Mongo stopped due to WiredTiger unexpected checksum Created: 15/Feb/16  Updated: 23/Feb/16  Resolved: 23/Feb/16

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Rémi Alvado Assignee: Susan LoVerso
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 14.04 on a LXC container running on a Ubuntu 14.04 host
64 GB of memory
Cluster of 4 members, all running the same version of MongoDB: 3.2.1


Attachments: Text File mongo_stacktrace_2016-02-15.log     Text File mongo_stacktrace_2016-02-15T17-51.log    
Backwards Compatibility: Fully Compatible
Operating System: Linux
Participants:

 Description   

Hi,

For a few weeks now, one of my MongoDB servers has been crashing randomly. Sometimes it happens after several days without issues, sometimes after only a few seconds.
The stack trace is always roughly the same (see attachment).
My configuration is:

systemLog:
    verbosity: 0
    quiet: false
    path: /var/log/mongodb/mongodb.log
    logAppend: false
    logRotate: rename
    destination: file
    timeStampFormat: iso8601-local
processManagement:
    fork: false
net:
    port: 27017
    bindIp: 127.0.0.1,10.0.1.98
    ipv6: false
    http:
        enabled: true
        JSONPEnabled: false
        RESTInterfaceEnabled: false
storage:
    dbPath: /var/lib/mongodb
    indexBuildRetry: true
    journal:
        enabled: true
    directoryPerDB: false
    engine: wiredTiger
    wiredTiger:
        engineConfig:
            cacheSizeGB: 8
            statisticsLogDelaySecs: 10
            journalCompressor: snappy
            directoryForIndexes: false
operationProfiling:
    slowOpThresholdMs: 100
    mode: slowOp
replication:
    replSetName: cluster1

It's always the same node that fails. I haven't trashed the container yet to test whether the crashes could be caused by a corrupt file somewhere. That is my next step, but I'd rather wait in case you need more details about this issue.
Thanks.



 Comments   
Comment by Susan LoVerso [ 23/Feb/16 ]

User hardware issue.

Comment by Rémi Alvado [ 23/Feb/16 ]

Hi Sue, we are using SSD disks in RAID 0; that might be the cause. Since I deleted all the data and resynced the node, I haven't had any more issues of this kind. The 176 KB of bad sectors was probably caused in the past, before the resync.
It seems the bug is not related to MongoDB but to a system failure (hardware or software). I think you can close this issue and delete the data.
Thanks for your investigation!

Comment by Susan LoVerso [ 22/Feb/16 ]

remi.alvado Thank you for sending the information.
We have looked at the file that gets the error. There is a very large chunk of zeroes in the collection with the error. There is a contiguous 176Kb range of zeroes in the file, including in the middle of a WT page. WT never writes large chunks of zeroes in that manner. Given that you said this always happens on the same node, periodically, can you check for or turn on logging of disk errors? What type of storage do you have? SSD or spinning media? As alexander.gorrod suggested earlier, the content of the file looks corrupted.

Comment by Rémi Alvado [ 19/Feb/16 ]

No problem, I've just uploaded it.

Comment by Kelsey Schubert [ 19/Feb/16 ]

Hi remi.alvado,

My apologies, I did not notice that we were missing part.30 earlier. Everything else is present, so once we have that file we can stitch them back together and continue to investigate this issue.

Thank you again,
Thomas

Comment by Rémi Alvado [ 19/Feb/16 ]

Hi Thomas, after another round of network issues, it finally worked. Can you check on your side that everything is fine? You should have 33 files of 5300000000 bytes and 1 of 33647360 bytes (part.33).
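
If it helps to double-check, a quick way to count the parts and group them by size on either side might be (run in the directory holding the parts; purely illustrative):

ls part.* | wc -l
ls -l part.* | awk '{ sizes[$5]++ } END { for (s in sizes) print sizes[s], "file(s) of", s, "bytes" }'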

Comment by Kelsey Schubert [ 18/Feb/16 ]

Hi remi.alvado,

Unfortunately, a number of the files did not come through.

Please reupload the following 11 files:

  • part.10
  • part.12
  • part.14
  • part.15
  • part.16
  • part.17
  • part.18
  • part.23
  • part.24
  • part.25
  • part.28

To summarize, we are missing the following part ranges: (10, 12, [14-18], [23-25], 28)

Thank you for your help!
Thomas

Comment by Rémi Alvado [ 18/Feb/16 ]

I've finished uploading all the files despite a few network issues along the way. Can you tell me if everything is OK on your side?

Comment by Ramon Fernandez Marina [ 17/Feb/16 ]

Thanks Rémi, we'll wait to hear from you then. And yes, uploaded files are routinely deleted after some time.

Comment by Rémi Alvado [ 17/Feb/16 ]

OK, I'll create a tgz file, split it into small pieces, and upload them to S3. I'll include everything so it will be easier for you to investigate. Most of our sensitive data is encoded, and I trust you to remove the files once the investigation is completed.
I'll let you know once everything has been uploaded.
Thanks.

Comment by Ramon Fernandez Marina [ 17/Feb/16 ]

remi.alvado, unfortunately the limit can't be increased. There's an easy workaround though, which is to use the split command as follows:

split -d -b 5300000000 filename.tgz part.

This will produce a series of part.XX where XX is a number; you can then upload those files via the S3 portal and we'll stitch them back together.
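
For reference, reassembling and sanity-checking the archive on the receiving side could look like this (file name is illustrative):

# Concatenate the numbered parts back into the original archive
cat part.?? > filename.tgz

# Verify the result is a readable tarball
tar -tzf filename.tgz > /dev/null && echo "archive OK"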

Comment by Rémi Alvado [ 17/Feb/16 ]

Hi Sue,

Why not, but it's still huge:

root@mongo06:/data/backups/mongodb# ll -h collection-27--2609336273829901117.wt
-rw-r--r-- 1 root root 73G févr. 16 08:22 collection-27--2609336273829901117.wt
root@mongo06:/data/backups/mongodb# du WiredTiger* | awk '{ print; total += $1 }; END { print "total size: ",total,"KB" }' | tail -n 1
total size:  9372468 KB

It's not a big issue on my side: we have a pretty decent internet connection in our datacenter, but it's much bigger than the 5 GB limit you have created on S3. Can you please increase this limit? I'll include the output of ls -lR database-directory inside the tarball as directory_listing.txt if that's fine with you.

Comment by Susan LoVerso [ 17/Feb/16 ]

Hi remi.alvado,

Let's start in the other direction. Please send a tarball containing these files from the database directory:

  • Only the collection file showing the error, collection-27--2609336273829901117.wt
  • All WiredTiger* files
  • _mdb_catalog.wt
  • sizeStorer.wt
  • output of an ls -lR database-directory

Once I decode the WiredTiger metadata to see its checkpoint information, I may want to see some of the journal files. Does that help work around your sensitive data and make the size more manageable?
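
A minimal sketch of one way to assemble that tarball, assuming the dbPath from the configuration above (archive name and staging paths are illustrative):

cd /var/lib/mongodb
# Capture the directory listing requested above
ls -lR . > /tmp/directory_listing.txt
# Archive only the requested files plus the listing
tar -czf /tmp/wt-debug.tgz \
    collection-27--2609336273829901117.wt \
    WiredTiger* _mdb_catalog.wt sizeStorer.wt \
    -C /tmp directory_listing.txt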

Comment by Rémi Alvado [ 17/Feb/16 ]

Hi Thomas, Hi Sue,

Following Alexander's comment, I've re-synced my node with the cluster. I made a copy of my database just before doing so, but the directory structure is around 163 GB and contains some sensitive data (hashed passwords).
Do you need the full database, or can I remove the collection containing the sensitive data? I just had a look at the directory structure and saw that it's probably not that easy to remove just one collection. Do you have a command I can execute to do it before creating the tarball?
Thanks

Comment by Susan LoVerso [ 16/Feb/16 ]

Hi remi.alvado, to be clear, please create the tarball of the affected database before attempting to restart mongod so that we can see the exact state of the database and its files when the error happened.

Comment by Kelsey Schubert [ 16/Feb/16 ]

Hi remi.alvado,

We would like to investigate this issue in more depth, can you please upload a tarball of the entire copy of the affected database, including the journal?

I have created a secure upload portal here to privately share your files with us. Please note that your files would only be visible to MongoDB employees examining this issue.

If the tarball is greater than 5GB, please let me know and I will create additional portals.

Thank you,
Thomas

Comment by Alexander Gorrod [ 15/Feb/16 ]

The information you uploaded is sufficient - it points to a corruption at the disk layer. Go ahead with the re-sync. I'd save the data aside until the resync completes if you have space.

Comment by Rémi Alvado [ 15/Feb/16 ]

Thanks Alexander! Do you want me to give you some extra info before trashing the data and syncing the container again?

Comment by Alexander Gorrod [ 15/Feb/16 ]

The error message you report is due to a corruption in the database on disk. WiredTiger uses checksums to ensure that the content on disk is as expected, and reports the error message you see when a mismatch is encountered.

The best way forward is to re-sync the node from another member of the replica set.
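
A minimal sketch of the usual initial-sync approach, assuming the dbPath from the configuration above (the service name and the path used to set the old data aside are illustrative; the old files are kept, as suggested, rather than deleted):

# Stop the failing member (service name may be mongod or mongodb depending on packaging)
sudo service mongod stop

# Move the corrupted data files aside rather than deleting them
sudo mv /var/lib/mongodb /data/backups/mongodb-corrupt
sudo mkdir -p /var/lib/mongodb
sudo chown mongodb:mongodb /var/lib/mongodb

# On restart with an empty dbPath, the member performs an initial sync
# from another replica set member
sudo service mongod start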

Comment by Rémi Alvado [ 15/Feb/16 ]

It happened again a few minutes after the last restart.
