[CSHARP-302] Frequent connection errors result in a Connect/Disconnect storm Created: 10/Aug/11  Updated: 02/Apr/15  Resolved: 12/Sep/11

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: 1.1
Fix Version/s: 1.2

Type: Bug Priority: Major - P3
Reporter: Robert Stam Assignee: Robert Stam
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

 Description   

Currently the C# driver closes all connections in the connection pool whenever an error occurs on any connection. This makes sense in a few cases (for example, the server went away), but in many cases it just makes things worse by causing a Connect/Disconnect storm where connections don't live for very long.

Furthermore, since Disconnect closes sockets as a background task, it is possible under heavy load to open connections faster than they are being closed, which can overwhelm the server.
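A rough simulation of the effect described above, written in Python purely for illustration (it is not driver code). The function name, the per-tick model, and all numbers are assumptions: each tick, an error causes the whole pool to be discarded and reopened, and a background closer can only close a limited number of sockets per tick.

```python
def peak_open_sockets(ticks, pool_size, closes_per_tick, synchronous):
    """Return the peak number of sockets the server would see open at once.

    Hypothetical model: every tick the whole pool is discarded and reopened.
    """
    live = 0      # sockets currently in the pool
    backlog = 0   # discarded sockets still waiting for the background closer
    peak = 0
    for _ in range(ticks):
        if synchronous:
            # Old sockets are fully closed before any new one is opened.
            live = 0
        else:
            # Old sockets are queued; the background closer drains only
            # closes_per_tick of them before new connections are opened.
            backlog += live
            live = 0
            backlog -= min(closes_per_tick, backlog)
        live = pool_size
        peak = max(peak, live + backlog)
    return peak


sync_peak = peak_open_sockets(10, 5, 2, synchronous=True)
async_peak = peak_open_sockets(10, 5, 2, synchronous=False)
print(sync_peak, async_peak)
```

Under these made-up parameters the synchronous variant never exceeds the pool size, while the background variant accumulates a growing backlog of half-closed sockets on the server side.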

This condition is triggered automatically by high exception rates under load, and can also be triggered by user code that calls Disconnect frequently (some developers have erroneously assumed that it is necessary to call Connect and Disconnect around every database operation).

The easiest way to tell if this is happening to you is to examine the server logs for frequent "end connection" log entries.
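One possible way to check for this is a small script that counts the "end connection" entries in a server log. This is a minimal sketch: the function name is hypothetical, the sample lines are made up for illustration, and only the "end connection" substring comes from this report.

```python
def count_end_connections(log_lines):
    """Count log lines that contain an "end connection" entry."""
    return sum(1 for line in log_lines if "end connection" in line)


# Made-up sample lines; a real server log will look different.
sample = [
    "[conn1] end connection 10.0.0.5:50112",
    "[conn2] query test.users",
    "[conn3] end connection 10.0.0.5:50113",
]
print(count_end_connections(sample))  # prints 2
```

To scan a real log, pass an open file object instead of the list. What counts as "frequent" depends on the workload, so compare the result against a period when the application was healthy.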

The driver should be very conservative about ever closing all the connections in the connection pool. It should only do so when it can be 100% certain that all connections are doomed and are going to fail anyway (at the moment I'm not sure how we could be 100% sure, so we might just stop doing this).



 Comments   
Comment by Robert Stam [ 12/Sep/11 ]

Added a new Unknown MongoServerState. On error, the state of the MongoServerInstance is now set to Unknown, which subsequently triggers a call to VerifyUnknownStates to determine the latest state of the replica set. Also, when a connection is closed it is now done synchronously, so the driver can never open new connections faster than it is closing old ones.
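The state handling described in this comment could be sketched roughly as follows. This is Python for illustration only, not the driver's C# code: the class layout, the `ping` callback, and the way verification decides the new state are all assumptions.

```python
from enum import Enum


class ServerState(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    UNKNOWN = "unknown"


class ServerInstance:
    """Hypothetical stand-in for one member of a replica set."""

    def __init__(self, address):
        self.address = address
        self.state = ServerState.CONNECTED

    def on_error(self):
        # An error no longer tears anything down; it only marks the
        # instance as needing verification.
        self.state = ServerState.UNKNOWN


def verify_unknown_states(instances, ping):
    """Re-check every instance whose state is UNKNOWN.

    `ping` is an assumed callable that returns True if the instance responds.
    """
    for instance in instances:
        if instance.state is ServerState.UNKNOWN:
            reachable = ping(instance)
            instance.state = (
                ServerState.CONNECTED if reachable else ServerState.DISCONNECTED
            )


a, b = ServerInstance("host-a"), ServerInstance("host-b")
a.on_error()
verify_unknown_states([a, b], ping=lambda inst: inst.address != "host-a")
print(a.state, b.state)
```

In this sketch only the instance that reported an error is re-checked; instances in a known state are left untouched.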

Comment by Aristarkh Zagorodnikov [ 10/Aug/11 ]

It appears that https://jira.mongodb.org/browse/CSHARP-153 might be related to this one now. IOException almost always signals that there is some serious problem with the server.
I think that creating another exception type (like MongoIOException or MongoCommunicationException, but NOT mixing it with MongoConnectionException) and wrapping all network I/O calls in it might be a good idea. Then DetermineAction can destroy the pool only for MongoCommunicationExceptions.
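A minimal sketch of this proposal, in Python for illustration only. The exception name follows the suggestion above; the `Action` enum, `send_over_network`, and the shape of `determine_action` are assumptions, not the driver's actual API.

```python
from enum import Enum


class Action(Enum):
    CLOSE_CONNECTION = "close connection"
    CLEAR_POOL = "clear pool"


class MongoCommunicationException(Exception):
    """Hypothetical wrapper for failures raised by network I/O."""


def send_over_network(io_call):
    """Run a network I/O callable, wrapping low-level I/O errors."""
    try:
        return io_call()
    except OSError as exc:  # Python's counterpart of an IOException
        raise MongoCommunicationException(str(exc)) from exc


def determine_action(exc):
    """Destroy the pool only for communication failures."""
    if isinstance(exc, MongoCommunicationException):
        return Action.CLEAR_POOL
    return Action.CLOSE_CONNECTION


def failing_read():
    raise OSError("connection reset by peer")


try:
    send_over_network(failing_read)
except MongoCommunicationException as exc:
    print(determine_action(exc))
```

Keeping the wrapper separate from other exception types means that an ordinary command failure never escalates to clearing the pool.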

Comment by Robert Stam [ 10/Aug/11 ]

Partially fixed this in master. The default action when there is an exception on a connection is now to discard only the affected connection, not the entire connection pool. Disconnect has also been made synchronous so it should no longer be possible to open new connections faster than old ones are being closed.
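The new default behavior can be illustrated with a toy pool, written in Python as a sketch only. The class and method names are hypothetical and do not mirror the driver's internals.

```python
class Connection:
    def __init__(self, ident):
        self.ident = ident
        self.closed = False

    def close(self):
        # Closing is synchronous: the socket is gone when this returns.
        self.closed = True


class ConnectionPool:
    def __init__(self, size):
        self.connections = [Connection(i) for i in range(size)]

    def handle_error(self, failed):
        """Default action: discard only the connection that failed."""
        failed.close()
        self.connections.remove(failed)

    def clear(self):
        """Old behavior: close every connection in the pool."""
        for connection in self.connections:
            connection.close()
        self.connections = []


pool = ConnectionPool(3)
bad = pool.connections[1]
pool.handle_error(bad)
print(len(pool.connections))  # prints 2
```

The remaining connections stay open and reusable, so a single failing operation no longer forces the application to reconnect everything.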
