[SERVER-10304] Don't hold mutex while trying to establish connection to replica sets Created: 23/Jul/13 Updated: 10/Dec/14 Resolved: 30/Jan/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Client |
| Affects Version/s: | 2.5.1 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Mathias Stearn |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
There are a couple of places inside ReplicaSetMonitor that hold the _setsLock mutex while creating a new connection to the seed nodes of a set. This is problematic when the monitor stops monitoring a set after getting continuous errors (it assumes the shard has been removed) and a later request then tries to talk to the removed set. That request prompts the monitor to recreate the set's monitor from the cached seed list, and the recreation happens while the mutex is held. If the connection attempts take a long time to fail, every other thread that wants to use the monitor to talk to other sets is blocked in the meantime. |
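One possible way to avoid the problem described above is to hold the mutex only for the map lookup and the final insert, and to run the slow connection step with the mutex released. The sketch below is illustrative only and is not the change that was made in the server (this ticket was closed as a duplicate). Apart from the `_setsLock` name taken from the description, every name here (`Monitor`, `MonitorRegistry`, `getOrCreate`, `Connector`) is hypothetical, and the connection step is modelled as a caller-supplied callback.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for a per-set monitor; not the server's actual class.
struct Monitor {
    std::string setName;
};

// Hypothetical registry that only illustrates the locking pattern.
class MonitorRegistry {
public:
    // Models the slow step (connecting to seed nodes). Assumed to return
    // nullptr when the set cannot be reached.
    using Connector = std::function<std::shared_ptr<Monitor>(const std::string&)>;

    std::shared_ptr<Monitor> getOrCreate(const std::string& name,
                                         const Connector& connect) {
        {
            // Step 1: short critical section, lookup only.
            std::lock_guard<std::mutex> lk(_setsLock);
            auto it = _sets.find(name);
            if (it != _sets.end()) {
                return it->second;
            }
        }

        // Step 2: slow connection attempt with the mutex released, so
        // threads asking for other sets are not blocked behind it.
        std::shared_ptr<Monitor> fresh = connect(name);
        if (!fresh) {
            return nullptr;
        }

        // Step 3: re-acquire the mutex and publish the result. If another
        // thread registered the same set in the meantime, emplace leaves
        // the existing entry in place and our copy is discarded.
        std::lock_guard<std::mutex> lk(_setsLock);
        auto result = _sets.emplace(name, fresh);
        return result.first->second;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lk(_setsLock);
        return _sets.size();
    }

private:
    mutable std::mutex _setsLock;
    std::map<std::string, std::shared_ptr<Monitor>> _sets;
};
```

The trade-off in this sketch is that two threads racing on the same missing set may both attempt a connection, and one result is thrown away. In exchange, a set that is slow to error out delays only the threads that asked for that set.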
| Comments |
| Comment by Randolph Tan [ 26/Jul/13 ] |
|
Attached a test patch that demonstrates the problem. |