[CSHARP-1547] Transient error handling Created: 27/Jan/16  Updated: 09/Mar/16  Resolved: 09/Mar/16

Status: Closed
Project: C# Driver
Component/s: Connectivity, Error Handling
Affects Version/s: 2.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Roberto Pérez Assignee: Unassigned
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:
  • Server side: a virtual machine hosted in Microsoft Azure running Ubuntu-14_04-LTS and MongoDB 3.2.1
  • Client side: A cloud service hosted in Microsoft Azure running Windows 2012 R2 and making calls to MongoDB using the .NET driver (2.1.0).


 Description   

We have a web application that queries a MongoDB database. Because we have thousands of documents, we paginate on the MongoDB side (I mention this, but I do not think it matters). If we do not interact with the web application for a few minutes, the first request after we resume fails (probably a timeout) and the MongoDB driver throws this exception:
Type: MongoConnectionException,
StackTrace: at MongoDB.Driver.Linq.MongoQueryProviderImpl`1.Execute(Expression expression)
at MongoDB.Driver.Linq.MongoQueryProviderImpl`1.Execute[TResult](Expression expression)
at System.Linq.Queryable.Count[TSource](IQueryable`1 source)

Subsequent requests succeed.

So far, we have wrapped all requests to MongoDB in Func<T> and Action delegates so that we can apply a simple retry strategy, but this has some problems:
1) We have to wrap every MongoDB call in a Func<T>, so there is a high probability that we miss some statements.
2) The retry works fine, but the exception is not thrown right away. It is thrown about 30-45 seconds after the request is sent.
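
For readers unfamiliar with the workaround, here is a minimal sketch of the kind of Func<T> retry wrapper described above. It is not the reporter's actual code and not part of the MongoDB driver: the TransientRetry class, its parameters, and the default delay are hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only; not part of the MongoDB driver.
public static class TransientRetry
{
    // Runs the operation and retries it when isTransient says the failure is retryable.
    // maxAttempts counts the first try, so maxAttempts = 3 means at most two retries.
    public static T Execute<T>(
        Func<T> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 3,
        TimeSpan? delay = null)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isTransient == null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            // The filter lets non-transient errors, and the final failed attempt,
            // propagate to the caller unchanged.
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Assumed default pause between attempts; tune for your environment.
                Thread.Sleep(delay ?? TimeSpan.FromMilliseconds(200));
            }
        }
    }
}
```

With the driver, a call might look like TransientRetry.Execute(() => query.Count(), ex => ex is MongoConnectionException), where query is an IQueryable obtained from a collection. Every call site still has to opt in, which is exactly the first problem listed above.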

So we would like to have some mechanism to handle these situations centrally (for example, in Entity Framework you can inject the transient error handling strategy by means of subclasses).



 Comments   
Comment by Roberto Pérez [ 02/Feb/16 ]

Okay, thanks Craig.

Let's close this ticket and wait for updates on ticket CSHARP-1343.

Regards

Comment by Craig Wilson [ 02/Feb/16 ]

If you'd like, you can file a separate feature request ticket for that. We have been discussing what is even possible internally between the drivers and have not come up with a good solution yet. We are thinking about it though.

UPDATE: In fact, here is the ticket about a retry policy: CSHARP-1343

Craig

Comment by Roberto Pérez [ 02/Feb/16 ]

Hello Craig again,

You really made my day ^^
I have repeated the steps and when setting maxIdleTime to 3 minutes, it works as expected.
mongoClientSettings.MaxConnectionIdleTime = new TimeSpan(0, 3, 0);

However, if I set it to 3 minutes and 59 seconds, it keeps failing. 3 minutes is a good value for us anyway.

Now that we have worked around the problem, should we still care about transient errors, or does the driver handle them and retry those requests?

Thank you so much,

Comment by Craig Wilson [ 02/Feb/16 ]

Great.

Azure has an appliance with a relatively low setting for killing off idle connections. In your connection string, you can set the maxIdleTime to be something under 4 minutes. This tells the driver that once a connection has been idle for that long, it should close the connection and create a new one. This is ideal for your situation, where your application may be idle for a period of time.

This setting is also controllable in Azure and can be configured for up to 30 minutes. Between the driver and Azure, you can tweak these numbers to get the best performance.
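
As a rough sketch of the driver-side setting described above, the idle time can be given in the connection string through the maxIdleTimeMS option or set on MongoClientSettings in code. The host name, the MongoClientFactory class, and the 3-minute value are assumptions for illustration; any value comfortably under the Azure idle timeout should behave the same way.

```csharp
using System;
using MongoDB.Driver;

public static class MongoClientFactory
{
    // Option 1: put the idle time in the connection string (180000 ms = 3 minutes).
    // "mongo.example.com" is a hypothetical host name.
    public static MongoClient FromConnectionString()
    {
        var url = new MongoUrl("mongodb://mongo.example.com:27017/?maxIdleTimeMS=180000");
        return new MongoClient(MongoClientSettings.FromUrl(url));
    }

    // Option 2: set the same value on the settings object in code.
    public static MongoClient FromSettings()
    {
        var settings = new MongoClientSettings
        {
            Server = new MongoServerAddress("mongo.example.com", 27017),
            // Connections idle longer than this are closed and replaced by the driver.
            MaxConnectionIdleTime = TimeSpan.FromMinutes(3)
        };
        return new MongoClient(settings);
    }
}
```

Either way, the client should be created once and reused so that the connection pool, and its idle-time handling, is shared across requests.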

Let me know if this works,
Craig

Comment by Roberto Pérez [ 02/Feb/16 ]

Hi Craig,

Thank you for your reply. Yes, everything is on Azure.

You can find the environment description in the Details --> Environment section.

Regards!

Comment by Craig Wilson [ 02/Feb/16 ]

Hi Roberto,

I have seen this behavior before. It has to do with idle connections that are closed by an external server or appliance. Could you provide some details on the environment you are running in? For instance, are your servers and/or client in Azure or AWS?

Craig
