[SERVER-28077] mongodb fassert v3.4.1 when encountering invalid data format Created: 23/Feb/17 Updated: 27/Oct/23 Resolved: 28/Feb/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.4.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Melody [X] | Assignee: | Unassigned |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | Easily reproduced by running an insert benchmark |
| Participants: |
| Description |
|
I encountered an annoying issue when performing inserts with MongoDB v3.4.1.
Options:
./mongod --dbpath /home/df --logpath mgo_test.log --logappend --oplogSize 50000 --replSet data --storageEngine wiredTiger --wiredTigerCacheSizeGB 12 --directoryperdb --fork
./mongo "rs.initiate()" |
| Comments |
| Comment by Michael Cahill (Inactive) [ 28/Feb/17 ] |

Melody, I'm glad the issue is resolved. I will close this issue. |
| Comment by Melody [X] [ 28/Feb/17 ] |

Michael Cahill, first, thank you for reformatting my logs to make them more readable. After careful checking, it turns out that the crash was caused by bad memory chips. I replaced them with new ones and the problem has been solved. Thank you for your help and patience. |
| Comment by Michael Cahill (Inactive) [ 28/Feb/17 ] |

Melody, you will need to sort out the block device and filesystem issues before it makes sense to test using MongoDB or any other database. We assume that the filesystem provides POSIX semantics, which includes reading back the same data that we write. Have you considered testing the block device as part of a RAID mirror with one or more block devices that are known to be reliable? That may help isolate where the data is being corrupted. |
| Comment by Melody [X] [ 27/Feb/17 ] |

Running the same test using XFS also crashed, with a different vmcore-dmesg.txt.
The MongoDB log is shown below:
Is there anything wrong with my machine configuration? Or does MongoDB require a specific configuration? |
| Comment by Melody [X] [ 27/Feb/17 ] |

Michael Cahill, first I need to explain that /dev/dfa is a block device; it is an SSD product from Shannon. I ran the same test using ext4 on another PC and encountered a different issue: the machine crashed and rebooted during the run. vmcore-dmesg:
|
| Comment by Michael Cahill (Inactive) [ 24/Feb/17 ] |

Melody, what type of device is /dev/dfa? What mount flags are used? Are any errors logged in dmesg output during the run? This type of error indicates that MongoDB wrote a block with some data, then later when it read back the same block, it got different data. In other words, the data was corrupted in between writing to the filesystem and reading back in again. That usually indicates bugs in the filesystem or block storage. |
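The write-then-read-back failure described above can be probed outside MongoDB. The following is a minimal sketch, not MongoDB's or WiredTiger's actual checksum logic: it writes deterministic blocks to a file, records a SHA-256 digest for each, and later reports which blocks no longer match. The function names `write_blocks` and `verify_blocks`, the block size, and the seed are all assumptions made for illustration.

```python
import hashlib
import os


def write_blocks(path, num_blocks, block_size=4096, seed=0):
    """Write deterministic pseudo-random blocks; return their SHA-256 digests."""
    digests = []
    with open(path, "wb") as f:
        for i in range(num_blocks):
            # Derive each block from the seed and its index so runs are reproducible.
            block = hashlib.shake_256(f"{seed}:{i}".encode()).digest(block_size)
            f.write(block)
            digests.append(hashlib.sha256(block).hexdigest())
        # Push the data out of Python's buffer and ask the OS to flush it to disk.
        f.flush()
        os.fsync(f.fileno())
    return digests


def verify_blocks(path, digests, block_size=4096):
    """Re-read the file and return the indices of blocks whose digest changed."""
    bad = []
    with open(path, "rb") as f:
        for i, expected in enumerate(digests):
            block = f.read(block_size)
            if hashlib.sha256(block).hexdigest() != expected:
                bad.append(i)
    return bad
```

One way to use it would be to call `write_blocks` on a file under the mount being tested (for example a hypothetical `/home/df/probe.bin`), then call `verify_blocks` on it later; any non-empty result means data changed between write and read. Note that a read issued soon after the write may be served from the page cache rather than the device, so a probe like this is more telling when the file is larger than RAM or is re-read after a reboot.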
| Comment by Melody [X] [ 24/Feb/17 ] |

Ramón, I ran the test again using XFS instead of ext4 and got the same crash. xfs_repair -n /dev/dfa
I look forward to your response. Haiyan |
| Comment by Melody [X] [ 24/Feb/17 ] |

I ran the test again using XFS instead of ext4 and got the same crash. I checked the XFS filesystem with xfs_repair; the output is listed below. Does it mean that the filesystem has no consistency problems? xfs_repair -n /dev/dfa
|
| Comment by Melody [X] [ 24/Feb/17 ] |

The filesystem used is ext4. I have already tried running on an XFS filesystem, with the same result. I have not run fsck. What am I expected to do with it? |
| Comment by Ramon Fernandez Marina [ 23/Feb/17 ] |

Thanks for opening a ticket, Melody. In the log I see the following:
This can be easily caused by a faulty storage layer, so the first order of business is to check the integrity of your disks. Also, the logs show you're not using XFS – what filesystem are you using? Have you run fsck on this filesystem? Thanks, |