[COMPASS-4547] Export/Import of very large collection loses documents Created: 21/Dec/20  Updated: 16/Sep/21  Resolved: 27/Jan/21

Status: Closed
Project: Compass
Component/s: Import/Export
Affects Version/s: 1.23.0
Fix Version/s: No version

Type: Bug
Priority: Major - P3
Reporter: Lester Waters
Assignee: Unassigned
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Compass v1.23.0 on Windows 10

Attachments: PNG File Compass_Export-Import.png    
Documentation Changes: Not Needed

Description

Problem Description

I created a large test collection with 152 million small documents, totalling 12.9 GB, and used Compass to export it as JSON. The export ran and, upon completion, Compass displayed a message stating it had completed AND also dropped the collection itself (!!). Sadly, I did not grab a screenshot of the message. No error was displayed, and a collection.json file was created as expected (12.9 GB in size).

I then imported the file into a new collection. The import ran slightly longer than the export and completed to 100%, but only 15.1 million documents were created, roughly 10% of the original 152 million.

I suspect that it was the EXPORT side that failed, even though it reached 100%. Why it also dropped the source collection is a mystery. I have not yet been able to reproduce the issue, but will attempt to do so once I have regenerated the NOW LOST test data, which took many hours to create.

Steps to Reproduce

I will recreate the large dataset and attempt to reproduce on v1.24.6.

Expected Results

Export of large datasets should work fine. 

Under no circumstances should the Compass Export process also DROP the source collection.

Import should fail if the import file is syntactically incorrect.

Actual Results

Documents were lost, likely during export. Import ran without any complaints. Does this mean the import file has all the closing syntax in place? 
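
One way to answer that question without loading the 12.9 GB file into memory is to stream it and count top-level documents while tracking bracket nesting. The following is a minimal Python sketch, not something Compass itself does. It assumes the export is UTF-8 and is either a single JSON array of documents or one document per line; the function name `count_exported_documents` is hypothetical.

```python
import io
from typing import BinaryIO, Tuple

QUOTE, BACKSLASH = ord('"'), ord("\\")
OPENERS, CLOSERS = b"{[", b"}]"
WHITESPACE = b" \t\r\n"


def count_exported_documents(stream: BinaryIO, chunk_size: int = 1 << 20) -> Tuple[int, bool]:
    """Return (document_count, structurally_complete) for a JSON export.

    Assumption: the file is either one outer JSON array of documents or
    newline-delimited documents. Only bracket nesting is checked, not
    full JSON validity.
    """
    depth = 0            # current nesting depth of {} and []
    in_string = False    # currently inside a JSON string literal
    escaped = False      # previous byte was a backslash inside a string
    count = 0
    object_depth = None  # depth at which documents start (set by first byte)

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for byte in chunk:
            if in_string:
                # Brackets inside strings must not affect the depth.
                if escaped:
                    escaped = False
                elif byte == BACKSLASH:
                    escaped = True
                elif byte == QUOTE:
                    in_string = False
                continue
            if byte in WHITESPACE:
                continue
            if object_depth is None:
                # An outer "[" means documents sit one level down.
                object_depth = 1 if byte == ord("[") else 0
            if byte == QUOTE:
                in_string = True
            elif byte in OPENERS:
                if byte == ord("{") and depth == object_depth:
                    count += 1
                depth += 1
            elif byte in CLOSERS:
                depth -= 1
                if depth < 0:
                    return count, False  # more closers than openers
    # Complete only if every bracket and string was closed.
    return count, depth == 0 and not in_string


if __name__ == "__main__":
    sample = io.BytesIO(b'[{"a": 1},\n{"b": {"c": "}"}}]')
    print(count_exported_documents(sample))
```

If the second value is `False`, the file was cut off mid-document or is missing closing brackets. A count far below the source collection's document count would point at the export rather than the import. Scanning byte by byte in pure Python is slow for a file of this size, so this is a diagnostic sketch only.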

Additional Notes

A screenshot of the approximate collection statistics prior to export is attached, along with the statistics after re-importing.

I can provide the export data file (12.9 GB) if requested. It is all dummy data.

Comments
Comment by Massimiliano Marcon [ 06/Jan/21 ]

lesterw@iotahoe.com The import/export functionality was rewritten a few months back, so this is not the same problem that was reported on Reddit. Still, data should not be dropped.

It also seems very odd that the collection was dropped; that is something we have never seen before, and we are definitely going to try to reproduce it. It would be useful to have the test data for that. If you generated it with a script, you can share the script instead, which is much easier than sharing 13 GB of data.

Comment by Lester Waters [ 21/Dec/20 ]

It appears that this is a known issue: https://www.reddit.com/r/mongodb/comments/d5b6fl/mongodb_compass_export_full_collection_limits/

Given this, Export should be fully disabled until it is fixed.  Also, why on earth did Export also DROP my collection?!?
