- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 1.3.7, 1.4.6
- Component/s: None
- None
- Fully Compatible
- Not Needed
With TLS configured on a topology, the connection.connect() method can hang forever. From what I can tell, timeouts are applied to every operation in that method except tls.Client.Handshake: if the remote server is up but mongod is hung, the Handshake call blocks indefinitely.
We discovered this bug after noticing that the driver continues to route traffic to servers that have crashed. If a mongod exits in a way that triggers a core dump (segfault, I/O error, etc.), the core dump can take a couple of minutes to write to disk. During this time, no topology updates are triggered in the driver, heartbeats hang, and server selection still returns the bad server.
This is fairly simple to repro:
- Start a mongod with SSL support
- Start this slightly modified version of the server_monitoring example: https://gist.github.com/bfink13/df7a72b46ce5c21ae9888ff60a36d54e
- Send a SIGSTOP to the mongod process: kill -STOP <pid of mongod>
- Observe that topology updates stop being generated
- If the mongod process is resumed (kill -CONT <pid of mongod>) or killed, topology updates resume