[SERVER-28732] Problems with mongorestore on ZFS with ARC cache Created: 11/Apr/17 Updated: 14/Apr/17 Resolved: 13/Apr/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.4.1 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Andrey Kostin | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: | mongorestore.txt |
| Participants: | Andrey Kostin, Kelsey Schubert |
| Description |
|
Hi, I really need your help to identify the source of this problem, so I'll try to describe the issue in detail. What we have: I use the following command to restore the collection:
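The exact command was not preserved in this export. A typical single-collection restore of a GridFS files collection might look like the following sketch; the host, database name, and dump path are assumptions, not values from the original report:

```
# Hypothetical invocation: host, database name, and dump path are
# placeholders. Restores one collection's BSON file into mydb.fs.files.
mongorestore --host localhost --port 27017 \
    --db mydb --collection fs.files \
    dump/mydb/fs.files.bson
```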
After that I run db.fs.files.validate(true) in the mongo shell and get the following:
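The validation output itself was not preserved in this export. For reference, the same full validation can be run non-interactively from a shell; validate(true) scans the collection's data and indexes rather than doing a metadata-only check:

```
# Run a full validation of mydb.fs.files from the command line; the
# database name is a placeholder. printjson renders the result document.
mongo mydb --eval 'printjson(db.fs.files.validate(true))'
```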
The problem is not limited to this collection: I started getting this kind of error with other collections' files and with index files four days ago (a log example is attached). The hard drives are fine: SMART reports zero reallocated sectors and only 1010 power-on hours. Also, zpool scrub doesn't find any errors.
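For anyone retracing these health checks, they are typically run as follows; the device and pool names here are placeholders, not values from the ticket:

```
# Inspect SMART attributes such as Reallocated_Sector_Ct and
# Power_On_Hours; /dev/sda is a placeholder for the actual drive.
smartctl -a /dev/sda

# Scrub the pool and review the result; "tank" is a placeholder pool name.
zpool scrub tank
zpool status -v tank
```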
I don't think this is a ZFS problem, since the dump was created successfully and mongorestore doesn't print any errors while loading it. What is the best way to find the buggy component (kernel, ZFS, LXD, MongoDB) in this situation? |
| Comments |
| Comment by Kelsey Schubert [ 13/Apr/17 ] | |
|
Hi lisio, Thank you for the update; I'm glad you were able to determine the root cause. Kind regards, | |
| Comment by Andrey Kostin [ 13/Apr/17 ] | |
|
The issue can be closed. The source of the problem is the ARC cache of ZFS and the way it releases memory: it is so slow that when lxc or mongodb asks for more memory, ZFS fails to free it in time and mongodb receives data with artifacts. |
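A common mitigation for this failure mode, offered here as a sketch rather than something stated in the ticket, is to cap the ARC so it never competes with mongod for memory. On ZFS on Linux the limit is the zfs_arc_max module parameter:

```
# Cap the ARC at 4 GiB at runtime (value in bytes); the 4 GiB figure is
# an arbitrary example, not a recommendation from this ticket.
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max

# Persist the cap across reboots.
echo "options zfs zfs_arc_max=4294967296" >> /etc/modprobe.d/zfs.conf
```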
| Comment by Andrey Kostin [ 11/Apr/17 ] | |
|
Thanks, I'll try to upload it a bit later, since I now have more information. |
| Comment by Kelsey Schubert [ 11/Apr/17 ] | |
|
Hi lisio, Thank you for the detailed report. So we can better investigate the corruption that is occurring, would you please restore this dump into a clean mongod and upload the affected $dbpath to this secure upload portal? Please be aware that there is a 5 GB maximum for files uploaded to the portal. There's an easy workaround, though, which is to use the split command as follows:
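The command itself did not survive this export; a sketch of the usual approach, with an illustrative dbpath and chunk size, is:

```
# Compress the dbpath to stdout and split it into numbered 4 GiB chunks
# (part.00, part.01, ...), keeping each piece under the 5 GB portal limit.
# The dbpath and chunk size are illustrative.
tar czf - /path/to/dbpath | split -d -b 4G - part.
```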
This will produce a series of part.XX files, where XX is a number; you can then upload these files via the S3 portal and we'll stitch them back together. Thank you, |
| Comment by Andrey Kostin [ 11/Apr/17 ] | |
|
I've just tried to run mongorestore once more and got this: mongorestore.txt |