[CSHARP-333] 1.2 Crashes our App Domain with an Exception: System.TimeoutException: Timeout waiting for a MongoConnection Created: 29/Sep/11  Updated: 02/Apr/15  Resolved: 06/Oct/11

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 1.2
Fix Version/s: 1.3

Type: Bug
Priority: Critical - P2
Reporter: Kenny Inggs
Assignee: Robert Stam
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Win 2008 R2 servers connecting to 3-node ubuntu RS on AWS


Attachments: Zip Archive Mongo1.2DriverTest.zip    
Issue Links:
Duplicate
duplicates CSHARP-406 Deadlock and TimeoutException when ac... Closed

 Description   

We experienced exactly the same symptoms as https://jira.mongodb.org/browse/CSHARP-323 last night when deploying to our live servers on AWS. Subsequently I tried going to the latest build of 1.3, with the same result. I reverted to 1.1, and we are trying to replicate the issue locally. I will post more info when I manage to do this.



 Comments   
Comment by Robert Stam [ 05/Mar/12 ]

CSHARP-406 fixes the most likely cause for seeing this exception. Note that under heavy loads you might still see this exception.

Comment by Kenny Inggs [ 07/Oct/11 ]

A huge thanks for your help Robert and Aaron. I can confirm that this was our problem as well. I just couldn't replicate it, even with days of testing, since the only place we accidentally called Disconnect was in a 'Healthcheck' page that we only use for our load-balancer on our live environment, so our test environment never fell over.

Comment by Robert Stam [ 06/Oct/11 ]

When an error occurs on a connection, that connection is closed and removed from the pool. A new connection will eventually be created, as needed, to replace it.
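To make the description above concrete, here is a minimal toy model in Python. It is not the C# driver's code; the class `SelfHealingPool` and its fields are hypothetical names chosen for illustration. It only sketches the idea that a failed connection gives up its slot, so the pool cannot be used up by errors.

```python
class SelfHealingPool:
    """Hypothetical toy pool; not the driver's implementation."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.created = 0    # total connections ever opened
        self.live = 0       # connections currently open (idle or in use)
        self.idle = []      # connections waiting to be reused

    def acquire(self):
        if self.idle:
            return self.idle.pop()
        if self.live >= self.max_size:
            raise TimeoutError("Timeout waiting for a connection")
        # Replacement connections are created lazily, only when needed.
        self.live += 1
        self.created += 1
        return f"conn-{self.created}"

    def release(self, conn, failed=False):
        if failed:
            self.live -= 1              # closed and dropped: its slot is freed
        else:
            self.idle.append(conn)      # healthy: goes back for reuse


pool = SelfHealingPool(max_size=1)
first = pool.acquire()
pool.release(first, failed=True)        # an error occurred on this connection
second = pool.acquire()                 # a new connection replaces it
```

In this sketch the pool never runs dry after an error because `live` is decremented whenever a failed connection is discarded.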

Comment by Aaron Barker [ 06/Oct/11 ]

Removed server.Disconnect() and now there are no problems, and it runs much faster too (which makes sense).

Interested to hear how the connection pooling works, like how does it prevent all of the available connections from being used up when exceptions occur?

Comment by Robert Stam [ 06/Oct/11 ]

I hadn't realized that the documentation at:

http://www.mongodb.org/display/DOCS/CSharp+Driver+Tutorial

didn't really discuss when it is appropriate to call Connect or Disconnect. I will add some documentation on that.

Because we use connection pooling, connections are not normally closed when a database operation completes. Instead, the connection just goes back to the connection pool to be used again later.

Connection pooling is all automatic. There is nothing you have to do (unless you are using RequestStart, and then you either have to use the using statement or make sure to call RequestDone to return the reserved connection to the connection pool).
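The reserve-and-return pattern mentioned above can be sketched language-neutrally. The Python below is a toy model, not the driver's API: `ToyPool` and `request` are hypothetical names, and Python's `with` block plays the role that the C# using statement plays for RequestStart. The point is only that a scoped block guarantees the reserved connection goes back to the pool, even when the operation fails.

```python
from contextlib import contextmanager


class ToyPool:
    """Hypothetical pool, for illustration only."""

    def __init__(self, size):
        self.idle = [f"conn-{i}" for i in range(size)]

    @contextmanager
    def request(self):
        conn = self.idle.pop()          # reserve one connection for this block
        try:
            yield conn
        finally:
            self.idle.append(conn)      # returned even if the block raises


pool = ToyPool(size=1)
try:
    with pool.request() as conn:
        raise ValueError("operation failed")
except ValueError:
    pass

idle_after_failure = len(pool.idle)     # the reserved connection is back
```

Without the scoped block, the caller would have to remember to return the connection on every code path, which is the manual RequestDone case described above.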

Comment by Aaron Barker [ 06/Oct/11 ]

So, I thought I was supposed to be calling Disconnect so that I would not run out of connections. Is it safe to assume that when the connection goes out of scope it will be disconnected safely?

Can you point me to the documentation for the guidance on this so I can understand it a little better?

Glad to help, least I can do since I love all the work you and the rest of the MongoDB team are doing.

Comment by Robert Stam [ 06/Oct/11 ]

While I don't recommend you call Disconnect after every database operation, it should not have resulted in an error.

There is a bug in Disconnect that causes the count of connections in the connection pool to get out of sync. After you call Disconnect enough times the connection pool thinks it is full when in fact it is empty (because the poolSize value is messed up). The result is the exception you are seeing.
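A toy model may help show how a counter that drifts out of sync produces this symptom. The Python below is a simplified sketch, not the driver's actual code: `DriftingPool`, `pool_size`, and the `fix_counter` flag are hypothetical, and the real bug may differ in detail. It assumes only what is described above: Disconnect empties the pool but leaves the size counter untouched.

```python
class DriftingPool:
    """Hypothetical toy pool that can reproduce a stale size counter."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.pool_size = 0      # connections the pool believes exist
        self.available = []     # connections actually idle in the pool

    def acquire(self):
        if self.available:
            return self.available.pop()
        if self.pool_size < self.max_size:
            self.pool_size += 1
            return object()     # stand-in for a newly opened connection
        raise TimeoutError("Timeout waiting for a connection")

    def release(self, conn):
        self.available.append(conn)

    def disconnect(self, fix_counter):
        self.available.clear()  # idle connections are closed
        if fix_counter:
            self.pool_size = 0  # corrected behavior: counter matches reality
        # Buggy behavior: pool_size keeps its old value.


def operations_until_timeout(fix_counter, max_size=3, attempts=10):
    """Run acquire/release/disconnect cycles; return the failing cycle or None."""
    pool = DriftingPool(max_size)
    for n in range(attempts):
        try:
            conn = pool.acquire()
        except TimeoutError:
            return n
        pool.release(conn)
        pool.disconnect(fix_counter)
    return None


buggy = operations_until_timeout(fix_counter=False)   # 3: fails once the counter hits max_size
fixed = operations_until_timeout(fix_counter=True)    # None: never times out
```

In the buggy run every cycle opens a new connection and bumps the counter, so after `max_size` cycles the pool reports itself full while holding no connections at all.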

So even though you shouldn't be calling Disconnect, I'm glad you did because it helped to discover this bug.

Thanks a bunch for the code to reproduce this. It made working on this really easy. I really appreciate the effort you put into submitting an easily reproducible case.

Comment by Robert Stam [ 06/Oct/11 ]

You shouldn't be calling Disconnect. Can you remove the calls to Disconnect and try again?

Comment by Aaron Barker [ 05/Oct/11 ]

Visual studio 2010 console project attached.

Ran this twice with the same results I'm getting on the website. After about 100 connections it dies with the error.

Comment by Robert Stam [ 05/Oct/11 ]

If you are not setting the slaveOk value anywhere then the default will be false. SlaveOk is a client side setting in the driver.

Is it possible to attach a console application that reproduces this?

Comment by Aaron Barker [ 05/Oct/11 ]

1. It happens after 20+ requests/connections have been made from a given application. The server is standalone (so no primary/slave). I've watched the server's process and logs and there is nothing there that looks suspicious.

2. Where do I check for that? I'm not setting slaveOk explicitly anywhere in my code. Is it a server or driver setting?

I upgraded the server to 2.0 and continued getting the same issues. I just reverted the C# driver to 1.1 (which has been working for months) and do not have the same issue.

Comment by Robert Stam [ 05/Oct/11 ]

@Aaron: a few questions:

1. Did this occur after the system had been running normally for some time, or when the very first request was made of the database? If after a while, did anything else happen at that time (a server went offline, a new primary was elected, etc.)?

2. What was the value of slaveOk?

Comment by Aaron Barker [ 05/Oct/11 ]

I'm getting something very similar with the 1.2 driver. Once I get the connection timeout I continue to get it for my client app until I restart it, but I can continue querying the database from other apps without problems. Below is my stack trace:

[MongoConnectionException: Unable to connect to server x.x.x.x:27017: Timeout waiting for a MongoConnection..]
MongoDB.Driver.Internal.DirectConnector.Connect(TimeSpan timeout) +777
MongoDB.Driver.MongoServer.Connect(TimeSpan timeout, ConnectWaitFor waitFor) +301
MongoDB.Driver.MongoServer.Connect(ConnectWaitFor waitFor) +78
MongoDB.Driver.MongoServer.ChooseServerInstance(Boolean slaveOk) +1008
MongoDB.Driver.MongoServer.AcquireConnection(MongoDatabase database, Boolean slaveOk) +365
MongoDB.Driver.MongoCursorEnumerator`1.AcquireConnection() +131
MongoDB.Driver.MongoCursorEnumerator`1.GetFirst() +91
MongoDB.Driver.MongoCursorEnumerator`1.MoveNext() +276
System.Linq.Enumerable.FirstOrDefault(IEnumerable`1 source) +4216356
MongoDB.Driver.MongoCollection.FindOneAs(IMongoQuery query) +167
TC.Services.MongoDB.MongoSession.Single(Int32 id) +383

Comment by Robert Stam [ 29/Sep/11 ]

Thanks. I'll keep an eye out for it.

Comment by Kenny Inggs [ 29/Sep/11 ]

I will add a stack trace as soon as I replicate it in our test environment which should be early tomorrow morning. The server was inactive at the time (right after the deployment), so I wouldn't expect a 'normal' Timeout. Will keep you posted.

Comment by Robert Stam [ 29/Sep/11 ]

Can you provide a stack trace, either to confirm that it really is the same issue as CSHARP-323 or, if not, to determine exactly where the problem might be?

A "Timeout waiting for a MongoConnection" can sometimes just mean that your server is overloaded and not responding quickly enough. Or if the load is very high you might just need configuration changes to increase the size of the connection pool and/or the timeout value for waiting for a connection.
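The trade-off between load, pool size, and wait timeout can be illustrated with a small toy model. The Python below is only a sketch under stated assumptions: `BoundedPool`, `max_size`, and `wait_timeout` are hypothetical names, not the driver's settings, and the "requests" simply hold their connections at the same time to imitate overlapping load.

```python
import queue


class BoundedPool:
    """Hypothetical pool: at most max_size connections, bounded wait."""

    def __init__(self, max_size, wait_timeout):
        self.wait_timeout = wait_timeout
        self._free = queue.Queue()
        for i in range(max_size):
            self._free.put(f"conn-{i}")

    def acquire(self):
        try:
            # Block for up to wait_timeout seconds for a free connection.
            return self._free.get(timeout=self.wait_timeout)
        except queue.Empty:
            raise TimeoutError("Timeout waiting for a connection") from None

    def release(self, conn):
        self._free.put(conn)


def count_timeouts(pool, concurrent_requests):
    """Hold connections simultaneously, as overlapping requests would."""
    held, timeouts = [], 0
    for _ in range(concurrent_requests):
        try:
            held.append(pool.acquire())
        except TimeoutError:
            timeouts += 1
    for conn in held:
        pool.release(conn)
    return timeouts


small = count_timeouts(BoundedPool(max_size=2, wait_timeout=0.01), 5)   # 3 requests time out
large = count_timeouts(BoundedPool(max_size=5, wait_timeout=0.01), 5)   # 0 requests time out
```

With five overlapping requests, the pool of two times out three of them, while the pool of five serves them all. A longer wait timeout would help only if connections were being returned while callers wait.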

Comment by Kenny Inggs [ 29/Sep/11 ]

I missed something: we got the issue when we deployed version 1.2 of the driver (upgraded from 1.1).

Generated at Wed Feb 07 21:36:30 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.