[SERVER-2554] Make listen backlog configurable Created: 15/Feb/11  Updated: 06/Dec/17  Resolved: 24/Aug/17

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: 1.6.5
Fix Version/s: 3.5.13

Type: Bug Priority: Major - P3
Reporter: Roger Binns Assignee: Andrew Morrow (Inactive)
Resolution: Done Votes: 4
Labels: platforms-re-triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 64 bit 10.10 using 10gen stable package


Issue Links:
Documented
is documented by DOCS-10739 Docs for SERVER-2554: Make listen bac... Closed
Related
related to SERVER-31251 Increase backlog specified in listen() Open
Backwards Compatibility: Fully Compatible
Operating System: Linux
Sprint: Platforms 2017-09-11
Participants:

 Description   

The socket listen backlog is not configurable. I have a situation where a large number of worker processes receive work notification from outside of Mongo and then all simultaneously connect to mongo get more details. They overwhelm the listen backlog and consequently some get connection refused. Most daemons let you configure the listen backlog size away from the normal OS default of 5.



 Comments   
Comment by Ramon Fernandez Marina [ 24/Aug/17 ]

Author:

{'username': u'acmorrow', 'name': u'Andrew Morrow', 'email': u'acm@mongodb.com'}

Message:SERVER-2554 Allow user configuration of the listen depth
Branch:master
https://github.com/mongodb/mongo/commit/0240a830447c54a63c5a925a96544450a08e8831

Comment by Andrew Morrow (Inactive) [ 08/Aug/17 ]

thestick613 - I see what you are saying. I'd missed that SOMAXCONN was hard coded to 128 on Linux. Given that setting a high listen value has a real cost, it seems that it really would be best to make this configurable, and have it default to SOMAXCONN. Note that any such configuration would still only be able to raise the listen limit to whatever broader limit was imposed by the system.

Comment by Tudor Aursulesei [ 08/Aug/17 ]

> uname -a
Linux mongo-rs1 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> mkdir -p /tmp/mongo
> strace -f -e trace=network mongod --dbpath /tmp/mongo/
strace: Process 252974 attached
strace: Process 252975 attached
2017-08-08T15:12:29.287+0300 I CONTROL  [initandlisten] MongoDB starting : pid=252973 port=27017 dbpath=/tmp/mongo/ 64-bit host=mongo-rs1
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] db version v3.4.6
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] git version: c55eb86ef46ee7aede3b1e2a5d184a7df4bfb5b5
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] allocator: tcmalloc
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] modules: none
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] build environment:
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten]     distmod: ubuntu1604
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten]     distarch: x86_64
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten]     target_arch: x86_64
2017-08-08T15:12:29.288+0300 I CONTROL  [initandlisten] options: { storage: { dbPath: "/tmp/mongo/" } }
[pid 252973] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
[pid 252973] setsockopt(6, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 252973] bind(6, {sa_family=AF_INET, sin_port=htons(27017), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 252973] socket(PF_LOCAL, SOCK_STREAM, 0) = 7
[pid 252973] setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 252973] bind(7, {sa_family=AF_LOCAL, sun_path="/tmp/mongodb-27017.sock"}, 110) = 0
2017-08-08T15:12:29.313+0300 I -        [initandlisten] Detected data files in /tmp/mongo/ created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-08-08T15:12:29.313+0300 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=257359M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
strace: Process 253063 attached
strace: Process 253064 attached
strace: Process 253065 attached
strace: Process 253066 attached
[pid 253066] +++ exited with 0 +++
[pid 253065] +++ exited with 0 +++
[pid 253064] +++ exited with 0 +++
[pid 253063] +++ exited with 0 +++
strace: Process 253067 attached
strace: Process 253068 attached
strace: Process 253069 attached
strace: Process 253070 attached
strace: Process 253071 attached
strace: Process 253072 attached
strace: Process 253073 attached
strace: Process 253074 attached
strace: Process 253075 attached
strace: Process 253076 attached
2017-08-08T15:12:29.826+0300 I CONTROL  [initandlisten] 
2017-08-08T15:12:29.826+0300 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-08-08T15:12:29.826+0300 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-08-08T15:12:29.826+0300 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2017-08-08T15:12:29.826+0300 I CONTROL  [initandlisten] 
2017-08-08T15:12:29.828+0300 I CONTROL  [initandlisten] 
2017-08-08T15:12:29.828+0300 I CONTROL  [initandlisten] ** WARNING: You are running on a NUMA machine.
2017-08-08T15:12:29.828+0300 I CONTROL  [initandlisten] **          We suggest launching mongod like this to avoid performance problems:
2017-08-08T15:12:29.828+0300 I CONTROL  [initandlisten] **              numactl --interleave=all mongod [other options]
2017-08-08T15:12:29.828+0300 I CONTROL  [initandlisten] 
strace: Process 253077 attached
2017-08-08T15:12:29.830+0300 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/tmp/mongo/diagnostic.data'
strace: Process 253078 attached
strace: Process 253079 attached
strace: Process 253080 attached
strace: Process 253081 attached
strace: Process 253082 attached
strace: Process 253083 attached
###############################
###############################
OVER HERE
[pid 253083] listen(6, 128)             = 0
[pid 253083] listen(7, 128)             = 0
2017-08-08T15:12:29.831+0300 I NETWORK  [thread1] waiting for connections on port 27017
###############################
###############################
^Cstrace: Process 252973 detached
strace: Process 252974 detached
strace: Process 252975 detached
strace: Process 253067 detached
strace: Process 253068 detached
strace: Process 253069 detached
2017-08-08T15:12:32.165+0300 I CONTROL  [signalProcessingThread] got signal 2 (Interrupt), will terminate after current cmd ends
strace: Process 253070 detached
strace: Process 253071 detached
strace: Process 253072 detached
strace: Process 253073 detached
strace: Process 253074 detached
strace: Process 253075 detached
2017-08-08T15:12:32.165+0300 I NETWORK  [signalProcessingThread] shutdown: going to close listening sockets...
strace: Process 253076 detached
strace: Process 253077 detached
2017-08-08T15:12:32.165+0300 I NETWORK  [signalProcessingThread] closing listening socket: 6
strace: Process 253078 detached
2017-08-08T15:12:32.165+0300 I NETWORK  [signalProcessingThread] closing listening socket: 7
strace: Process 253079 detached
strace: Process 253080 detached
2017-08-08T15:12:32.165+0300 I NETWORK  [signalProcessingThread] removing socket file: /tmp/mongodb-27017.sock
strace: Process 253081 detached
strace: Process 253082 detached
strace: Process 253083 detached
2017-08-08T15:12:32.165+0300 I NETWORK  [signalProcessingThread] shutdown: going to flush diaglog...
2017-08-08T15:12:32.165+0300 I FTDC     [signalProcessingThread] Shutting down full-time diagnostic data capture
2017-08-08T15:12:32.166+0300 I STORAGE  [signalProcessingThread] WiredTigerKVEngine shutting down
2017-08-08T15:12:32.225+0300 I STORAGE  [signalProcessingThread] shutdown: removing fs lock...
2017-08-08T15:12:32.225+0300 I CONTROL  [signalProcessingThread] now exiting
2017-08-08T15:12:32.225+0300 I CONTROL  [signalProcessingThread] shutting down with code:0
2017-08-08T15:12:32.225+0300 I CONTROL  [initandlisten] shutting down with code:0

Comment by Tudor Aursulesei [ 08/Aug/17 ]

db version v3.4.6

Comment by Eric Sommer [ 08/Aug/17 ]

thestick613, which dot release of 3.4 are you currently using?

Comment by Tudor Aursulesei [ 07/Aug/17 ]

Hello, this is actually still present in 3.4, as you can see from this strace.

Comment by Andrew Morrow (Inactive) [ 07/Aug/17 ]

thestick613 - Could you please let me know what version of MongoDB you are running where you observe this? We actually believe this issue was fixed in MongoDB 3.4 as part of SERVER-26012 (see commit https://github.com/mongodb/mongo/commit/f0df1d05b6098bfc01d0351e6867f044542d3b39). That should set the listen limit as high as it can possibly go, leaving any limits as part of OS configuration (e.g. /proc/sys/net/core/somaxconn on Linux or kern.ipc.somaxconn on Darwin, both of which will silently truncate the listen limit to the os configured value). If you are still observing this behavior with MongoDB 3.4, then I'd suggest checking whether it is your OS configuration that is actually limiting the listen backlog.

Comment by Tudor Aursulesei [ 06/Aug/17 ]

This should also be merged into mongos. Port 2000 is where i run my mongos

cap> dmesg -T
 ** [out :: 10.10.3.130] [Wed Aug  2 19:30:40 2017] TCP: request_sock_TCP: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
 ** [out :: 10.10.3.131] [Wed Aug  2 19:30:17 2017] TCP: request_sock_TCP: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
 ** [out :: 10.10.3.132] [Wed Aug  2 19:31:59 2017] TCP: request_sock_TCP: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.

Comment by Tudor Aursulesei [ 21/Jul/17 ]

What is the status of this? I sometimes get

[Thu Jul 20 23:57:46 2017] TCP: request_sock_TCP: Possible SYN flooding on port 30001. Sending cookies. Check SNMP counters.

After i get this message, the mongod instance gets very unresponsive and i need to restart it to get things back to normal. I haven't yet figured out if it gets syn-flooded by a replica set or by the application.

Comment by Roger Binns [ 12/Dec/11 ]

To my knowledge there is no way to make it adaptive (ie not need configuration). You can look at the rate of incoming connections, but you can't tell when a connection initially arrived (ie how long it was in the listen queue before accept was called). You also get no notification whatsoever about the connection requests that were rejected because the queue was full.

(You could in theory see how long something was in the listen queue if you also had a network tap and looked at the TCP headers and understood the operating system behaviour with regard to TCP timestamping. However this is orders of magnitude more code and considerably more fragile than an integer option.)

Comment by Eliot Horowitz (Inactive) [ 12/Dec/11 ]

The goal is to have as few options as needed.
So if there is an intelligent way to not make it configurable, then it would be ideal
If that's not possible - then we'll make it configurable.

Comment by Roger Binns [ 12/Dec/11 ]

The limit is still a problem for me. Once more worker machines come online I'll be having the listen length issue again.

All of my worker processes (not threads due to Python GIL) talk to MongoDB. I end up with over 1,000 connections to MongoDB and all of those happen at startup virtually instantaneously (lots of forks).

I'm rather sceptical that one size will fit everyone hence a need for it to be configurable. There is a reason why virtually every UNIX daemon that will serve non-trivial loads has it as a configurable, and rarely use the OS default of 5.

As an example something else I use is gearman which takes --backlog and a default of 32. MySql takes --back_log and I don't know what the default is. I don't run postgres so I can't tell what it does, but I did find a CVS commit from last year making the listen backlog configurable.

So it is evident that other databases and similar system components had the need to make the value configurable. What makes using MongoDB so special that its default is the perfect value for everyone?

Comment by Eliot Horowitz (Inactive) [ 11/Dec/11 ]

The reason its open is to figure out if we should make it configurable.
I'm leaning towards no - but not 100% sure.

Comment by John Zwinck [ 11/Dec/11 ]

Isn't it already set to 128? I did the archaeology and found this which seems to be the original code calling listen(2) four years ago, with a backlog of 128: https://github.com/mongodb/mongo/commit/5ab9ae9d47df7fb32e6dd95b404e783020281218#grid/message.cpp

So, is this bug valid? Does it refer to something I'm not seeing?

Comment by Roger Binns [ 15/Feb/11 ]

128 would certainly have fixed the issue for me. However it is traditional that the number is configurable (command line flag) since it is unlikely one number fits all circumstances.

Comment by Eliot Horowitz (Inactive) [ 15/Feb/11 ]

We could just raise it to 128.
I think that's what mysql does for example.
Thoughts?

Generated at Thu Feb 08 03:00:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.