[CSHARP-1614] Using GridFS (Legacy) with multiple MongoClients throws Exception Created: 29/Mar/16  Updated: 30/Mar/16  Resolved: 30/Mar/16

Status: Closed
Project: C# Driver
Component/s: GridFS
Affects Version/s: 1.11, 2.2.3
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Chris Gårdenberg Assignee: Robert Stam
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

I have a main MongoClient whose GridFS contains files, and I want to move those files to a server reached through a second MongoClient (different connection string).

I get "InvalidOperationException: A nested call to RequestStart was made that is not compatible with the existing request."

Example code to reproduce the problem:

var guiDb = new MongoClient("mongodb://server1:27017/GridFSTest").GetServer().GetDatabase("GridFSTest");
var fs = guiDb.GetGridFS(new MongoGridFSSettings() { Root = "TestArea" });
var allFiles = fs.FindAll().ToList();
 
var server = new MongoClient("mongodb://server2:27017/GridFSTestTracker").GetServer().GetDatabase("GridFSTest");
 
var gridfs = server.GetGridFS(new MongoGridFSSettings()
{
	Root = "TestArea"
});
 
foreach (var file in allFiles)
{
	var createOptions = new MongoGridFSCreateOptions
	{
		Id = file.Id,
		ContentType = file.ContentType,
		UploadDate = DateTime.Now,
		Metadata = file.Metadata
	};
	gridfs.Upload(file.OpenRead(), file.Name, createOptions); // This row throws the exception
}



 Comments   
Comment by Robert Stam [ 30/Mar/16 ]

Just a heads up that in the 2.x GridFS API you are no longer allowed to set the file id yourself (the same is true across all drivers).

Instead you should put your id in the metadata for the GridFS file.
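A minimal sketch of that suggestion, building on the 2.x streaming sample further down this thread. The metadata key "originalId" and the class and method names are hypothetical, chosen only for illustration. The code assumes the 2.x driver packages are referenced and has not been run against a live server.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

public static class GridFSCopyWithOriginalId
{
    // Hypothetical metadata key; any name that does not clash with existing metadata would do.
    private const string OriginalIdKey = "originalId";

    // Streams one file from the source bucket to the destination bucket and
    // records the source file's id in the destination file's metadata.
    public static ObjectId Copy(GridFSBucket fromGridFS, GridFSBucket toGridFS, GridFSFileInfo fileInfo)
    {
        // Keep any existing metadata; Metadata is null when the source file has none.
        var metadata = fileInfo.Metadata == null
            ? new BsonDocument()
            : (BsonDocument)fileInfo.Metadata.DeepClone();
        metadata[OriginalIdKey] = fileInfo.Id;

        var options = new GridFSUploadOptions { Metadata = metadata };
        using (var fromStream = fromGridFS.OpenDownloadStream(fileInfo.Id))
        {
            // The driver generates the new file id and returns it.
            return toGridFS.UploadFromStream(fileInfo.Filename, fromStream, options);
        }
    }

    // Looks a file up in the destination bucket by the id it had on the source.
    public static GridFSFileInfo FindByOriginalId(GridFSBucket bucket, ObjectId originalId)
    {
        var filter = Builders<GridFSFileInfo>.Filter.Eq("metadata." + OriginalIdKey, originalId);
        return bucket.Find(filter).FirstOrDefault();
    }
}
```

With this approach the application keeps using its own identifier for lookups, while the file id on the destination is whatever the driver assigns.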

Comment by Chris Gårdenberg [ 30/Mar/16 ]

Many thanks, will use the version that downloads the file to memory first. (We need to set our Id ourselves, since we use it as an identifier.)

Comment by Robert Stam [ 29/Mar/16 ]

To copy a GridFS file from one server to another using the 2.x GridFS API it is possible to "stream" the file from the source to the destination so that the file being transferred is never entirely in memory.

Sample code:

private static void CopyFile(MongoClient fromClient, MongoClient toClient, string filename)
{
    var fromDatabase = fromClient.GetDatabase("test");
    var fromGridFS = new GridFSBucket(fromDatabase);
 
    var toDatabase = toClient.GetDatabase("test");
    var toGridFS = new GridFSBucket(toDatabase);
 
    var filenameFilter = Builders<GridFSFileInfo>.Filter.Eq(i => i.Filename, filename);
    var sortByDescendingUploadDateTime = Builders<GridFSFileInfo>.Sort.Descending(i => i.UploadDateTime);
    var findOptions = new GridFSFindOptions { Sort = sortByDescendingUploadDateTime, Limit = 1 };
    var fileInfo = fromGridFS.Find(filenameFilter, findOptions).Single();
 
    using (var fromStream = fromGridFS.OpenDownloadStream(fileInfo.Id))
    {
        toGridFS.UploadFromStream(filename, fromStream);
    }
}

Comment by Robert Stam [ 29/Mar/16 ]

To copy a GridFS file from one server to another using the Legacy API the source file must be downloaded in its entirety from the source server and then uploaded to the destination server.

Sample code:

private static void CopyFile(MongoClient fromClient, MongoClient toClient, string filename)
{
    var fromServer = fromClient.GetServer();
    var fromDatabase = fromServer.GetDatabase("test");
    var fromGridFS = fromDatabase.GridFS;
 
    var toServer = toClient.GetServer();
    var toDatabase = toServer.GetDatabase("test");
    var toGridFS = toDatabase.GridFS;
 
    using (var memoryStream = new MemoryStream())
    {
        fromGridFS.Download(memoryStream, filename);
        memoryStream.Position = 0;
        toGridFS.Upload(memoryStream, filename);
    }
}

If the files are too large to fit in memory you could modify this code to use temporary files on disk.
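A sketch of the temporary-file variant, assuming the same Legacy API calls as the sample above (Download and Upload with a Stream). The method name CopyFileViaTempFile is made up for this example, and the code has not been run against a live server.

```csharp
using System.IO;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

public static class LegacyGridFSCopy
{
    public static void CopyFileViaTempFile(MongoClient fromClient, MongoClient toClient, string filename)
    {
        var fromGridFS = fromClient.GetServer().GetDatabase("test").GridFS;
        var toGridFS = toClient.GetServer().GetDatabase("test").GridFS;

        // Spool the file to disk instead of holding it in a MemoryStream.
        var tempPath = Path.GetTempFileName();
        try
        {
            using (var tempStream = new FileStream(tempPath, FileMode.Create, FileAccess.ReadWrite))
            {
                // The download finishes completely before the upload starts,
                // so the two servers are never accessed in an interleaved way.
                fromGridFS.Download(tempStream, filename);
                tempStream.Position = 0;
                toGridFS.Upload(tempStream, filename);
            }
        }
        finally
        {
            // Remove the temporary file even if the copy fails.
            File.Delete(tempPath);
        }
    }
}
```

The trade-off is extra disk I/O in exchange for memory use that no longer grows with the file size.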

Comment by Robert Stam [ 29/Mar/16 ]

The Legacy GridFS API uses RequestStart at the beginning of a download to ensure that all chunks are read from the same server. This is potentially important when downloading a GridFS file from a secondary in a replica set.

In my code sample in the previous comment RequestStart is called by OpenRead, and the "request" remains in effect until the fromStream is closed (i.e. when the file is no longer being downloaded).

In the Legacy API RequestStart is a way of "pinning" a thread to a connection, so that all operations on that thread will be done using the same connection to a single server. Because the request is tied to the thread, there can only be one RequestStart in effect at a time (although nested calls to RequestStart are allowed as long as they are compatible with the current request).

In this case, Upload is calling RequestStart again, but this time for the destination server. This nested RequestStart is not compatible with the request that OpenRead already has in effect.

The net result is that the Legacy API does not support "streaming" a GridFS file from one server to another, as that would require interleaving database operations between the two servers and the use of RequestStart prevents that.
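To make the mechanism concrete, here is a deliberately simplified, hypothetical model of a thread-pinned request. It is not the driver's implementation: the RequestModel and Demo classes are invented for this sketch, and "compatible" is reduced to "same server name". It only illustrates why a nested request for a different server fails while the outer one is still open.

```csharp
using System;

// Hypothetical model only; the real driver tracks connections, not server names.
public static class RequestModel
{
    [ThreadStatic] private static string _pinnedServer; // server this thread is pinned to
    [ThreadStatic] private static int _nestingLevel;    // number of open requests on this thread

    public static IDisposable RequestStart(string server)
    {
        // A nested request is only allowed if it targets the server already pinned.
        if (_nestingLevel > 0 && _pinnedServer != server)
        {
            throw new InvalidOperationException(
                "A nested call to RequestStart was made that is not compatible with the existing request.");
        }
        _pinnedServer = server;
        _nestingLevel++;
        return new RequestScope();
    }

    // Ends one request when disposed; assumed to be disposed on the thread that created it.
    private sealed class RequestScope : IDisposable
    {
        private bool _disposed;

        public void Dispose()
        {
            if (_disposed) return;
            _disposed = true;
            if (--_nestingLevel == 0) _pinnedServer = null;
        }
    }
}

public static class Demo
{
    // Mirrors the failing copy: the "download" request is still open when the "upload" starts.
    public static bool NestedIncompatibleThrows()
    {
        using (RequestModel.RequestStart("server1"))          // OpenRead pins the thread to the source
        {
            using (RequestModel.RequestStart("server1")) { }  // same server: nested request is allowed
            try
            {
                RequestModel.RequestStart("server2");         // Upload targets the destination
                return false;
            }
            catch (InvalidOperationException)
            {
                return true;
            }
        }
    }
}
```

In this model Demo.NestedIncompatibleThrows() returns true. Closing the source request before opening the destination one, which is what downloading the whole file first achieves, avoids the conflict.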

There are two workarounds:

1. If you need to use the Legacy API, download the source file in its entirety before uploading it to the destination server
2. Use the new GridFS API introduced in 2.x, which doesn't use RequestStart and does support "streaming" a GridFS file from one server to another

Sample code to follow.

Comment by Robert Stam [ 29/Mar/16 ]

I can reproduce using the following simplified code:

private static void CopyFile(MongoClient fromClient, MongoClient toClient, string filename)
{
    var fromServer = fromClient.GetServer();
    var fromDatabase = fromServer.GetDatabase("test");
    var fromGridFS = fromDatabase.GridFS;
 
    var toServer = toClient.GetServer();
    var toDatabase = toServer.GetDatabase("test");
    var toGridFS = toDatabase.GridFS;
 
    using (var fromStream = fromGridFS.OpenRead(filename))
    {
        toGridFS.Upload(fromStream, filename);
    }
}

Explanation and workaround to follow...

Comment by Robert Stam [ 29/Mar/16 ]

Thanks. I'll try to repro locally using two standalones. I don't think the version of the servers involved matters, but it's good to know them in case it does.

Comment by Chris Gårdenberg [ 29/Mar/16 ]

They are standalone servers on different machines, no replica set configured.

Version on source is 3.3.3, version on target is 2.4.3.

If I only run the Upload with a single Connection (without source client), it works.

Comment by Robert Stam [ 29/Mar/16 ]

Can you tell me what kind of servers are involved? Are they replica sets?
