[CSHARP-4900] Uploading a duplicate file larger than the original one causes errors in downloading the original file Created: 04/Jan/24  Updated: 24/Jan/24  Resolved: 24/Jan/24

Status: Closed
Project: C# Driver
Component/s: GridFS
Affects Version/s: 2.23.1
Fix Version/s: 2.24.0

Type: Bug Priority: Unknown
Reporter: Marek Kedziora Assignee: Oleksandr Poliakov
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to PYTHON-4146 Use insert_many to upload GridFS chun... Closed
Backwards Compatibility: Fully Compatible
Documentation Changes: Not Needed

 Description   

Summary

Uploading a duplicate file (same file_id) larger than the original causes errors when downloading the original file. The upload fails with a DuplicateKey error, but the chunks collection retains parts of the second file. A subsequent attempt to download the original file then fails with a GridFSChunkException.

Versions

MongoDB server 6.0.4 (Windows standalone / Docker sharded cluster / Windows sharded cluster)

MongoDB.Driver.GridFS 2.23.1

How to Reproduce

using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var r = new Random();
var content1 = new byte[10];            // small original file (fits in one chunk)
var content2 = new byte[36_000_000];    // larger duplicate (spans many chunks)
r.NextBytes(content1);
r.NextBytes(content2);

string fileId = "1";

var client = new MongoClient();
client.DropDatabase("TestDuplicate");
var db = client.GetDatabase("TestDuplicate");
var bucket = new GridFSBucket<string>(db);

// Upload the original file.
bucket.UploadFromBytes(fileId, "unrelevant", content1);

try
{
    // Upload a larger file under the same id; this fails on the duplicate key
    // but leaves some of the new file's chunks behind.
    bucket.UploadFromBytes(fileId, "unrelevant", content2);
}
catch (MongoBulkWriteException e)
{
    Console.WriteLine(e.Message);
    // A bulk write operation resulted in one or more errors.
    // WriteErrors: [ { Category : "DuplicateKey", Code : 11000, Message : "E11000 duplicate key error collection: TestDuplicate.fs.chunks index: files_id_1_n_1 dup key: { files_id: "1", n: 0 }" } ]
}

try
{
    // Downloading the original file now fails, because the chunks collection
    // contains a mix of the original chunk and orphaned chunks of the duplicate.
    var read = bucket.DownloadAsBytes(fileId);
}
catch (GridFSChunkException e)
{
    Console.WriteLine(e.Message);
    // GridFS chunk 1 of file id 1 is missing
}

 

Additional Background

The chunks collection contains one chunk from the original file (n: 0) and 64 orphaned chunks (n: 64..127) from the duplicate file.
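For context, the expected chunk layout can be derived from the file sizes. A minimal sketch of the arithmetic, assuming the default 255 KiB GridFS chunk size (it is written in Python rather than C# purely for brevity; `chunk_count` is an illustrative helper, not a driver API):

```python
import math

# GridFS default chunk size (assumption: bucket created with default options).
CHUNK_SIZE = 255 * 1024  # 261120 bytes

def chunk_count(file_length: int) -> int:
    """Number of chunks a file of the given length occupies."""
    return max(1, math.ceil(file_length / CHUNK_SIZE))

print(chunk_count(10))          # original file: 1 chunk (n: 0)
print(chunk_count(36_000_000))  # duplicate file: 138 chunks (n: 0..137)
```

So the 36 MB duplicate would span 138 chunks in total; the 64 orphans (n: 64..127) observed above are consistent with the driver inserting chunks in batches, where the batch containing n: 0 failed on the duplicate key while another batch had already succeeded. Batching of chunk inserts is also the subject of the linked PYTHON-4146.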



 Comments   
Comment by Boris Dogadov [ 05/Jan/24 ]

Hi marek.kedziora@kdpw.pl 

Thanks for filing this issue and providing reproduction code.
We've reproduced this behavior and will be investigating further.

Please follow this ticket for further updates.

Comment by PM Bot [ 04/Jan/24 ]

Hi marek.kedziora@kdpw.pl, thank you for reporting this issue! The team will look into it and get back to you soon.

Generated at Wed Feb 07 21:49:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.