[SERVER-26740] Total connections is not stable and spiky when user authentication is enabled Created: 24/Oct/16 Updated: 08/Feb/23 Resolved: 09/Jun/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Networking, Security |
| Affects Version/s: | 3.2.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | 아나 하리 | Assignee: | Spencer Jackson |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
The MongoDB cluster consists of 28 shards and 15 mongos instances. The MongoDB connections graph is spiky only when user authentication is enabled, not when authentication is disabled. The Java client driver also returns errors (like the one below) or slow responses when user authentication is enabled. (I attached connections and query/sec graphs.)
[ERROR] [c.k.s.m.g.r.RequestExecuteCallable] execute (71): Too many threads are already waiting for a connection. Max number of threads (maxWaitQueueSize) of 400 has been exceeded.
The connection spikes (and the client queue errors) happen at intervals of about 5 minutes, though not exactly 5 minutes. https://github.com/mongodb/mongo/blob/master/src/mongo/executor/connection_pool.cpp#L177 |
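To make the client-side error above more concrete, here is a minimal sketch of how a bounded wait queue turns a short stall into rejected requests. It is a toy model, not the Java driver's code: the function name and the `pool_size` value of 100 are assumptions for illustration, and only the wait queue size of 400 comes from the error message.

```python
def admit_requests(concurrent_requests, pool_size, max_wait_queue_size):
    """Split simultaneous requests into (served, queued, rejected) counts.

    Toy model: every pooled connection can serve one request, further
    requests wait in a bounded queue, and anything beyond that is rejected
    (the situation the "maxWaitQueueSize ... exceeded" error reports).
    """
    served = min(concurrent_requests, pool_size)
    queued = min(concurrent_requests - served, max_wait_queue_size)
    rejected = concurrent_requests - served - queued
    return served, queued, rejected


# pool_size=100 is a hypothetical value; 400 is taken from the error above.
served, queued, rejected = admit_requests(600, pool_size=100, max_wait_queue_size=400)
print(served, queued, rejected)
```

Under these assumed numbers, any stall that lets more than 500 requests pile up on one client would start producing the error, which is consistent with errors appearing only while responses are slow.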
| Comments |
| Comment by Kelsey Schubert [ 09/Jun/17 ] |
|
Hi matt.lee, We've seen significant improvement for users running with Kind regards, |
| Comment by Mira Carey [ 09/Nov/16 ] |
|
I had a conversation with an engineer on our security team and he had an idea he's written down in While it won't make it for 3.4.0, it's an important change that I'll definitely be considering for a backport. |
| Comment by Mira Carey [ 08/Nov/16 ] |
|
My apologies, I hadn't updated the description of |
| Comment by 아나 하리 [ 08/Nov/16 ] |
|
Thanks, Jason. So you think a connection spike is expected when client authentication is enabled. And, Regards, |
| Comment by Mira Carey [ 07/Nov/16 ] |
|
Regarding HostTimeoutMS: it is the number of milliseconds to keep a pool alive, with its minimum connections (default 1) open, when there is no activity. So if you're seeing spikes in this way, it would be for hosts you're not actively using. In general, it's not surprising that auth would change the cost of these kinds of pool evictions (as it adds a large fixed overhead to connection establishment, by design). One pattern you could be in might be: All of this in mongos:
In the meantime, your best mitigation would be to keep some level of regular traffic going to those hosts. Regards, |
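The interaction between the host timeout and the fixed cost of authentication described in this comment can be illustrated with a toy model. This is a sketch under stated assumptions, not the logic in connection_pool.cpp: the class and function names are hypothetical, the 300,000 ms timeout is chosen only to match the roughly 5-minute interval reported in the description, and the connect and auth costs are made-up numbers.

```python
class ToyHostPool:
    """Toy per-host pool that is dropped after host_timeout_ms without activity."""

    def __init__(self, host_timeout_ms, connect_cost_ms, auth_cost_ms):
        self.host_timeout_ms = host_timeout_ms
        self.connect_cost_ms = connect_cost_ms
        self.auth_cost_ms = auth_cost_ms
        self.alive = False
        self.last_used_ms = 0

    def request(self, now_ms):
        """Return the setup cost (ms) paid by a request arriving at now_ms."""
        # Expire the pool if it sat idle for at least the host timeout.
        if self.alive and now_ms - self.last_used_ms >= self.host_timeout_ms:
            self.alive = False
        cost = 0
        if not self.alive:
            # Re-establishing the pool pays connect plus auth overhead again.
            cost = self.connect_cost_ms + self.auth_cost_ms
            self.alive = True
        self.last_used_ms = now_ms
        return cost


def total_setup_cost(pool, interval_ms, request_count):
    """Sum the setup cost of request_count requests spaced interval_ms apart."""
    return sum(pool.request(i * interval_ms) for i in range(request_count))


# Hypothetical numbers: 5-minute timeout, 1 ms connect, 50 ms auth.
sparse = total_setup_cost(ToyHostPool(300_000, 1, 50), 360_000, 5)  # every 6 min
steady = total_setup_cost(ToyHostPool(300_000, 1, 50), 60_000, 5)   # every 1 min
print(sparse, steady)
```

In this model, traffic spaced wider than the timeout pays the full connect-plus-auth cost on every request, while steady traffic pays it once, which is the intuition behind keeping regular traffic going to otherwise idle hosts.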
| Comment by 아나 하리 [ 07/Nov/16 ] |
|
Hi Thomas. I uploaded mongod and mongos log files (not for all mongos and mongod servers, only for one mongos and one mongod server). 18:30 ~ 19:00 (+09:00) :: auth enabled test Regards, |
| Comment by Kelsey Schubert [ 04/Nov/16 ] |
|
Hi matt.lee, I've created a secure upload portal for you to use here. Files uploaded to this portal are visible only to MongoDB employees investigating this issue and are routinely deleted after some time. Would you please let us know when you have completed the upload so we can continue to investigate this issue? We'll continue to investigate this issue here to determine whether there is an appropriate code change that would improve the behavior you are observing. Thank you, |
| Comment by 아나 하리 [ 04/Nov/16 ] |
|
Andreas Nilsson, is there any way to send log files to you by email or some other way? Regards, |
| Comment by Andreas Nilsson [ 03/Nov/16 ] |
|
matt.lee do you have any logs from mongos/mongod during the time of spiky connections? It's hard to say anything using only the diagnostics data. Thanks, |
| Comment by Andreas Nilsson [ 02/Nov/16 ] |
|
Hi matt.lee, sorry for not getting to this earlier. It is on our queue and we will look at the metrics file tomorrow. Best, |
| Comment by 아나 하리 [ 02/Nov/16 ] |
|
Hi... Is there anyone who can look into this case? Regards, |
| Comment by 아나 하리 [ 25/Oct/16 ] |
|
We restarted a few times during the test. Thanks. |
| Comment by Ramon Fernandez Marina [ 24/Oct/16 ] |
|
matt.lee, can you please upload the logs of a mongos and a mongod during a period of spiky connections? I'd also like to take a look at the contents of the diagnostic.data from a mongod inside one of the shards. I'm looking to see where those spiky connections come from, so please choose a representative mongod; for example, if you're doing secondary reads then include logs and diagnostic data from the node that's serving those reads as well as its primary. Thanks, |