[SERVER-20724] Collection Corruption that won't fix with repairdatabase or mongodump Created: 01/Oct/15 Updated: 28/Oct/15 Resolved: 28/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MMAPv1 |
| Affects Version/s: | 3.0.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Allan Edwards | Assignee: | Ramon Fernandez Marina |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Operating System: | ALL |
| Steps To Reproduce: | Run repairDatabase or mongodump on the affected collection |
| Participants: |
| Description |
|
I have a collection that is about 300 GB and contains about 5.5 million documents. When I run repairDatabase or mongodump, it gets to the 15th data file for the collection and both tools fail. |
| Comments |
| Comment by Ramon Fernandez Marina [ 02/Oct/15 ] |
|
siliconplains44, can you please upload the following files?
That may be sufficient for us to investigate further. Also, can you post the output of:
where <dbpath> is the database path for this mongod instance? Thanks, |
| Comment by Ramon Fernandez Marina [ 02/Oct/15 ] |
|
siliconplains44, we may only need the filestore.15 file – I've asked to see if that's the case, which would make the upload the simplest option. If we need more than that we'll evaluate your proposal. Thanks for your patience, |
| Comment by Allan Edwards [ 01/Oct/15 ] |
|
The tarred file is in the 400 GB range. There is no way I can upload this much data to you guys. Would you by chance be open to me giving you access to a server made from a snapshot of this data, so that you can log in remotely to Google Cloud and work on the data there? |
| Comment by Ramon Fernandez Marina [ 01/Oct/15 ] |
|
Thanks for the log, siliconplains44; the error is being triggered at:
which could be caused by the data being corrupted on disk. This could have happened due to an error in the storage layer, so I'd recommend you look for storage errors in the system logs. I reckon this may not yield any useful data: if the error happened a while back but was only detected now, the logs may no longer be available. To rule out a bug in mongod as the cause of this problem, I'd like to ask you to share this database with us so we can inspect the nature of the corruption. I've created an upload portal so you can send us data privately and securely. You'll need to split the data into chunks to be able to upload it; here's how:
This will create a set of part.NN files that you can upload; you'll need to perform the tar and split operations on a disk with enough free space. Alternatively, you can upload the local.* and filestorage.* files directly; the tar+split method only reduces the number of files to upload. Thanks, |
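As a rough illustration of the chunking step described above (not the commands from the original comment), the sketch below splits one large archive into numbered part.NN files in Python. The function name `split_file`, its parameters, and the 1 MiB read buffer are assumptions made for this example; it also assumes the archive has already been created and that the output directory has enough free space.

```python
from pathlib import Path


def split_file(src, out_dir, chunk_bytes, prefix="part."):
    """Split src into files named <prefix>00, <prefix>01, ... of at most chunk_bytes each.

    Hypothetical helper for illustration; returns the list of part paths in order.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    buf_size = 1024 * 1024  # read in 1 MiB blocks so a chunk is never held fully in memory
    parts = []
    with src.open("rb") as f:
        index = 0
        while True:
            remaining = chunk_bytes
            block = f.read(min(buf_size, remaining))
            if not block:
                break  # source exhausted; do not create an empty part
            part = out_dir / f"{prefix}{index:02d}"
            with part.open("wb") as out:
                while block:
                    out.write(block)
                    remaining -= len(block)
                    if remaining == 0:
                        break
                    block = f.read(min(buf_size, remaining))
            parts.append(part)
            index += 1
    return parts
```

Concatenating the parts in name order reproduces the original archive byte for byte, so the receiving side can reassemble it before extracting.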
| Comment by Allan Edwards [ 01/Oct/15 ] |
|
Here is the dump...
|
| Comment by Allan Edwards [ 01/Oct/15 ] |
|
I will run the process again and send you the full dump. |
| Comment by Ramon Fernandez Marina [ 01/Oct/15 ] |
|
siliconplains44, can you please post the logs you get when you run the repair operation? What is the exact error message? |