[SERVER-32916] mongos creates lots of connections to mongod nodes Created: 26/Jan/18  Updated: 22/Feb/18  Resolved: 29/Jan/18

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: shawn Assignee: Mark Agarunov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File metrics.tar.gz     File mongod.log.tar.gz     File mongos.log.tar.gz    
Participants:

 Description   

Hi,

We have been running a MongoDB sharded cluster for several weeks, and it generally works well.
However, we found that mongos sometimes creates a large number of connections to the primary nodes in a shard set.
When this occurs, the response time of the cluster increases sharply
and exceeds what we can accept.

We found some typical statements in the logs, as follows:

Logs from mongos:

2018-01-25T22:10:00.006+0800 I ASIO [NetworkInterfaceASIO-TaskExecutorPool-8-0] Connecting to 10.136.142.35:28000
2018-01-25T22:10:00.008+0800 I ACCESS [conn2318094] Successfully authenticated as principal useeeeeeeer on admin
2018-01-25T22:10:00.013+0800 I ASIO [NetworkInterfaceASIO-TaskExecutorPool-1-0] Successfully connected to 10.136.180.52:28000, took 10111ms (10 connections now open to 10.136.180.52:28000)
2018-01-25T22:10:00.013+0800 I ASIO [NetworkInterfaceASIO-TaskExecutorPool-12-0] Failed to connect to 10.136.179.52:28000 - HostUnreachable: End of file
2018-01-25T22:10:00.013+0800 I ASIO [NetworkInterfaceASIO-TaskExecutorPool-12-0] Failed to close stream: Transport endpoint is not connected
2018-01-25T22:10:00.014+0800 I ASIO [NetworkInterfaceASIO-TaskExecutorPool-8-0] Connecting to 10.136.5.44:28000

Logs from mongod:
2018-01-25T22:10:00.002+0800 I NETWORK [thread2] connection accepted from 10.136.180.33:30285 #7564018 (10233 connections now open)
2018-01-25T22:10:00.006+0800 I NETWORK [thread2] connection refused because too many open connections: 10240
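
To quantify a spike like this across a full mongod log, the two NETWORK messages quoted above can be tallied with a short script. This is only a minimal sketch: the regular expressions are written against the two sample lines shown here and may need adjusting for other MongoDB versions, and the function name `summarize` is hypothetical.

```python
import re

# Patterns written against the two mongod NETWORK lines quoted above.
# Other server versions may format these messages differently (assumption).
ACCEPTED = re.compile(
    r"connection accepted from (\S+):\d+ #\d+ \((\d+) connections? now open\)"
)
REFUSED = re.compile(
    r"connection refused because too many open connections: (\d+)"
)


def summarize(lines):
    """Return peak open connections, refusal count, and accepts per client IP."""
    peak_open = 0
    refused = 0
    accepted_by_client = {}
    for line in lines:
        match = ACCEPTED.search(line)
        if match:
            client_ip = match.group(1)
            open_now = int(match.group(2))
            accepted_by_client[client_ip] = accepted_by_client.get(client_ip, 0) + 1
            peak_open = max(peak_open, open_now)
            continue
        if REFUSED.search(line):
            refused += 1
    return {
        "peak_open": peak_open,
        "refused": refused,
        "accepted_by_client": accepted_by_client,
    }


# The two sample lines from this report.
sample = [
    "2018-01-25T22:10:00.002+0800 I NETWORK [thread2] connection accepted "
    "from 10.136.180.33:30285 #7564018 (10233 connections now open)",
    "2018-01-25T22:10:00.006+0800 I NETWORK [thread2] connection refused "
    "because too many open connections: 10240",
]
print(summarize(sample))
```

Grouping accepted connections by client IP shows which mongos hosts open the most connections during the spike.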

Notes:
1. We set maxIncomingConnections on mongod to 10240.
2. When the cluster is in its normal state, each mongod has only several hundred connections.

We can provide the detailed logs and FTDC data to help find the root cause.

Thanks.



 Comments   
Comment by Mark Agarunov [ 29/Jan/18 ]

Hello shawn001,

Thank you for providing this information. Looking over the logs and diagnostic data, this appears to be caused by the rapid increase in connections. My recommendation would be adjusting the connection pool settings as described in SERVER-25027 to better match your specific workload. If you are still seeing this issue after making these adjustments, please let me know and we will continue the investigation.

Thanks,
Mark
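
As a rough aid for choosing connection pool settings, the worst-case number of connections that the mongos fleet can open to a single mongod can be estimated with simple arithmetic. The sketch below assumes the settings in question are the mongos parameters `taskExecutorPoolSize` and `ShardingTaskExecutorPoolMaxSize`, and that each task executor pool may open up to the maximum pool size of connections per host; both assumptions should be checked against the documentation for the version in use. The mongos count and pool count in the example are hypothetical, and only the 10240 limit comes from this report.

```python
def max_connections_per_mongod(num_mongos, task_executor_pool_size, pool_max_size):
    """Approximate upper bound on mongos-originated connections to one mongod.

    Assumes every mongos runs `task_executor_pool_size` pools and each pool
    opens at most `pool_max_size` connections to a given host (assumption).
    """
    return num_mongos * task_executor_pool_size * pool_max_size


def largest_safe_pool_max_size(num_mongos, task_executor_pool_size,
                               max_incoming, headroom_percent=20):
    """Largest per-pool maximum that keeps the bound under the mongod limit.

    `headroom_percent` reserves part of the limit for other clients, such as
    replication and monitoring connections (the 20% default is arbitrary).
    """
    budget = max_incoming * (100 - headroom_percent) // 100
    return budget // (num_mongos * task_executor_pool_size)


# Hypothetical topology: 8 mongos processes with 16 task executor pools each.
# The 10240 limit is the mongod connection limit mentioned in this ticket.
limit = largest_safe_pool_max_size(num_mongos=8, task_executor_pool_size=16,
                                   max_incoming=10240)
print(limit)
print(max_connections_per_mongod(8, 16, limit))
```

This bound ignores connections from other sources, so it is a starting point for tuning rather than a guarantee.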

Comment by shawn [ 29/Jan/18 ]

Hello Mark

I have uploaded the files you requested.

I hope they help identify the root cause.

Thanks.

Comment by Mark Agarunov [ 26/Jan/18 ]

Hello shawn001,

Thank you for the report. To get a better idea of why you may be seeing this issue, could you please provide the complete logs from all affected mongod and mongos nodes, as well as an archive (tar or zip) of the $dbpath/diagnostic.data directory?

Thanks,
Mark
