[CSHARP-454] "Deadlock" in connection pool management Created: 25/Apr/12 Updated: 02/Apr/15 Resolved: 26/Apr/12 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | None |
| Affects Version/s: | 1.4.1 |
| Fix Version/s: | 1.4.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Aristarkh Zagorodnikov | Assignee: | Robert Stam |
| Resolution: | Done | Votes: | 0 |
| Labels: | c#, connections, deadlock, driver | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Description |
|
Recently, we started getting occasional unexplained "hangs" that coincided with hitting the connection pool upper limit. I spent some time debugging but could not reproduce the problem on a test machine, so I waited until one of our servers hung, then dumped the app server process and loaded the dump into WinDBG. I would like to note that this is a very disruptive issue, because sooner or later it brings down any server that approaches a certain load. The most obvious fix is to increase the connection pool limit, which appears to solve the issue, but it doesn't feel like a proper long-term solution. |
| Comments |
| Comment by Aristarkh Zagorodnikov [ 27/Apr/12 ] |
|
Good to hear, waiting for 1.4.2 release =) |
| Comment by Robert Stam [ 26/Apr/12 ] |
|
This should be fixed now. Changes include:
1. RequestStart/Done now release the lock before calling out to other methods
2. Ping and VerifyState now use a new connection instead of one from the connection pool

Using a new connection for Ping and VerifyState prevents these methods from being stalled when the connection pool is oversubscribed. Opening and closing a connection for just this purpose is not too much overhead because it's only done every few seconds (every 10 seconds at the moment). There are also minor changes to MongoConnection reflecting the fact that we can now have a connection that is not part of the connection pool. |
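
For readers unfamiliar with the pattern, the two changes can be illustrated with a minimal, language-neutral sketch. This is not the driver's code: it is written in Python for brevity, and `Pool`, `Server`, `request_start`, `request_done` and `ping` are hypothetical stand-ins for the driver's classes and methods. It only shows the general idea of (a) updating shared state under a lock but releasing the lock before calling into the pool, and (b) running the health check on a connection that does not come from the pool.

```python
import threading


class Pool:
    """Toy bounded connection pool (hypothetical, not the driver's pool)."""

    def __init__(self, max_size):
        self._slots = threading.BoundedSemaphore(max_size)

    def acquire(self, timeout):
        # Return a connection token, or None if the pool stays exhausted
        # for `timeout` seconds.
        if self._slots.acquire(timeout=timeout):
            return object()
        return None

    def release(self, conn):
        self._slots.release()


class Server:
    """Toy server object (hypothetical) that guards its state with a lock."""

    def __init__(self, pool):
        self._pool = pool
        self._lock = threading.Lock()
        self._request_count = 0

    def request_start(self, timeout=0.1):
        # Touch shared state while holding the lock...
        with self._lock:
            self._request_count += 1
        # ...but call out to the pool only after the lock is released, so a
        # blocked acquire cannot stall other threads that need the lock.
        conn = self._pool.acquire(timeout)
        if conn is None:
            with self._lock:
                self._request_count -= 1
        return conn

    def request_done(self, conn):
        with self._lock:
            self._request_count -= 1
        # Again, the call into the pool happens outside the lock.
        self._pool.release(conn)

    def ping(self):
        # Assumption: object() stands in for opening a dedicated connection
        # that is not part of the pool, so the check cannot be stalled by
        # pool exhaustion.
        dedicated_conn = object()
        return dedicated_conn is not None
```

In this sketch, with a pool of size 1 and one request in flight, a second `request_start` times out and returns `None`, while `ping()` still completes immediately because it neither waits on the pool nor needs a lock held by a blocked caller.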
| Comment by Robert Stam [ 26/Apr/12 ] |
|
Thanks for reporting this. We're working on it. |
| Comment by Aristarkh Zagorodnikov [ 26/Apr/12 ] |
|
Also, it appears that this problem was there for some time, it just became more visible since |