- Type: New Feature
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 1.12.5, 2.4.1
- Component/s: None
- None
- Minor Change
We deploy applications in Docker containers on DC/OS, using virtual machines provisioned with OpenStack. We do not use floating IPs (FIPs) on our DC/OS private agents, so our traffic goes through network address translation (NAT) in OpenStack's network virtualization layer, MidoNet. When our application establishes a connection to a Mongo replica set, MidoNet closes the connection after 60 seconds without network traffic. The result is a Mongo::OperationTimeout, which we have observed is caused by a connection close issued by the Mongo server when the driver attempts to use a pooled connection that MidoNet has already closed.
We tried a few ways of configuring TCP keep-alive at the operating system layer, but the container did not appear to inherit these settings. We also found that the Ruby Mongo library does not implement a keep-alive and does not provide a hook to configure one manually. We are testing an implementation in a forked version of the Ruby library that adds the following line to Mongo::TCPSocket#handle_connect:
sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)
Initial findings are positive: in the attached screenshot you can see that the time spent in the network layer was previously at least ~50ms (and as high as 320ms) and is now below 50ms.
We are on an older version (1.12.5), but we also noticed that this configuration is not available in the latest version.
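For anyone who wants to try the socket option outside the driver, here is a minimal sketch using only Ruby's standard socket library. The helper names (enable_keepalive, keepalive_enabled?) and the tuning values are hypothetical and are not part of the Mongo library. The assumption is that the first keep-alive probe must go out before MidoNet's 60-second idle timeout, hence an idle time of 30 seconds. The TCP_KEEP* tuning options are platform-specific, so the sketch applies them only where Ruby defines the constants.

```ruby
require 'socket'

# Hypothetical helper (not part of the Mongo driver): enables TCP keep-alive
# on an already-connected socket and optionally tunes the probe timing.
def enable_keepalive(sock, idle: nil, interval: nil, count: nil)
  # Same call as the one-line change proposed for handle_connect.
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)

  # Tuning knobs are not available on every platform, so each one is
  # guarded by a check that the constant exists.
  if idle && Socket.const_defined?(:TCP_KEEPIDLE)
    # Seconds of idleness before the first probe is sent.
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPIDLE, idle)
  end
  if interval && Socket.const_defined?(:TCP_KEEPINTVL)
    # Seconds between subsequent probes.
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPINTVL, interval)
  end
  if count && Socket.const_defined?(:TCP_KEEPCNT)
    # Unanswered probes before the connection is considered dead.
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPCNT, count)
  end
  sock
end

# Hypothetical helper: reads the option back to confirm it took effect.
def keepalive_enabled?(sock)
  sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE).bool
end
```

Calling enable_keepalive(sock, idle: 30, interval: 10, count: 3) on a connected TCPSocket and then checking keepalive_enabled?(sock) inside the container shows whether the setting is actually applied there, independent of the host's sysctl values.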
is related to
- RUBY-1799 Connection options for tcp_keepalive_* not exposed to Mongo::Client (Closed)
- DRIVERS-383 Enable and configure TCP Keepalive by default (Closed)
related to
- RUBY-1283 Enable and configure TCP Keepalive by default (Closed)