[GODRIVER-1879] topology.connection TLS handshake never times out
Created: 09/Feb/21  Updated: 28/Oct/23  Resolved: 08/Mar/21

Status: Closed
Project: Go Driver
Component/s: None
Affects Version/s: 1.3.7, 1.4.6
Fix Version/s: 1.4.7

Type: Bug
Priority: Major - P3
Reporter: Brian Fink
Assignee: Divjot Arora (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Documentation Changes: Not Needed

 Description   

With TLS configured on a topology, the connection.connect() method can hang forever. From what I can tell, timeouts are applied to every operation in that method except tls.Client.Handshake: if the remote server is up but mongod is hung, the Handshake method hangs indefinitely.

We discovered this bug after noticing that the driver continues to route traffic to servers that have crashed. If a mongod exits in a way that triggers a core dump (segfault, i/o error, etc.), the core dump can take a couple of minutes to write to disk - during this time, no topology updates are triggered in the driver, heartbeats hang, and server selection still returns the bad server.
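
For illustration, here is a minimal standard-library sketch of the mechanism described above. It is not the driver's code and not the actual fix. It assumes that a deadline on the underlying net.Conn is one reasonable way to bound the handshake. The helper names (handshakeWithTimeout, startSilentListener, runDemo) and the 200ms demoTimeout are hypothetical. The "hung" server is simulated by a listener that accepts TCP connections and reads from them but never replies.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// demoTimeout is an arbitrary value chosen for this sketch.
const demoTimeout = 200 * time.Millisecond

// handshakeWithTimeout (hypothetical helper) wraps nc in a TLS client and
// bounds the handshake by setting a deadline on the underlying connection.
// Without the deadline, Handshake blocks for as long as the peer stays silent.
func handshakeWithTimeout(nc net.Conn, cfg *tls.Config, timeout time.Duration) (*tls.Conn, error) {
	if err := nc.SetDeadline(time.Now().Add(timeout)); err != nil {
		return nil, err
	}
	client := tls.Client(nc, cfg)
	if err := client.Handshake(); err != nil {
		return nil, fmt.Errorf("TLS handshake failed: %w", err)
	}
	// Clear the deadline so later reads and writes can set their own.
	if err := nc.SetDeadline(time.Time{}); err != nil {
		return nil, err
	}
	return client, nil
}

// isTimeout reports whether err wraps a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// startSilentListener simulates a host that is up but whose server process
// is hung: TCP connections are accepted and read, but nothing is ever sent.
func startSilentListener() (net.Listener, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go io.Copy(io.Discard, c) // discard input, never reply
		}
	}()
	return ln, nil
}

// runDemo dials the silent listener and returns the handshake error.
func runDemo() error {
	ln, err := startSilentListener()
	if err != nil {
		return err
	}
	defer ln.Close()

	nc, err := net.DialTimeout("tcp", ln.Addr().String(), demoTimeout)
	if err != nil {
		return err
	}
	defer nc.Close()

	_, err = handshakeWithTimeout(nc, &tls.Config{ServerName: "localhost"}, demoTimeout)
	return err
}

func main() {
	err := runDemo()
	fmt.Printf("handshake error: %v (timeout: %t)\n", err, isTimeout(err))
}
```

Removing the first SetDeadline call in handshakeWithTimeout should make the handshake block against the same listener until one side closes the connection, which matches the hang reported here.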

This is fairly simple to repro:



 Comments   
Comment by Githook User [ 08/Mar/21 ]

Author: Divjot Arora <divjot.arora@10gen.com> (divjotarora)

Message: GODRIVER-1879 Apply connectTimeoutMS to TLS handshake (#594)
Branch: release/1.4
https://github.com/mongodb/mongo-go-driver/commit/3cf67b94f793d2a1fd062b8d56b730b4d7c4b7e9

Comment by Githook User [ 08/Mar/21 ]

Author: Divjot Arora <divjot.arora@10gen.com> (divjotarora)

Message: GODRIVER-1879 Apply connectTimeoutMS to TLS handshake (#594)
Branch: release/1.5
https://github.com/mongodb/mongo-go-driver/commit/2a5f9a4fa2c39e810a954a2d68757a81bc4ed8c1

Comment by Githook User [ 08/Mar/21 ]

Author: Divjot Arora <divjot.arora@10gen.com> (divjotarora)

Message: GODRIVER-1879 Apply connectTimeoutMS to TLS handshake (#594)
Branch: master
https://github.com/mongodb/mongo-go-driver/commit/5c0f679db9314d18c64c2f96f9c4c23ac867975e

Comment by Divjot Arora (Inactive) [ 01/Mar/21 ]

bfink@stripe.com Our policy is to backport onto the latest released minor version (currently 1.4.x). We've made exceptions in cases where the minor version was very new, but this will be the seventh patch release for the 1.4.x branch and the 1.3.x branch is no longer tracked by our CI system, so we feel that backporting to it would be too risky at this point.

– Divjot

Comment by Brian Fink [ 26/Feb/21 ]

Great!

Is there any chance this could be backported to 1.3.x?

Comment by Divjot Arora (Inactive) [ 26/Feb/21 ]

bfink@stripe.com I put up a new PR for this ticket: https://github.com/mongodb/mongo-go-driver/pull/594.

Comment by Brian Fink [ 25/Feb/21 ]

Awesome, thank you!

Comment by Divjot Arora (Inactive) [ 25/Feb/21 ]

Hi bfink@stripe.com,

This ticket was incorrectly closed. I've moved it back to "Investigating" so it won't be auto-closed again. I plan on putting up a PR for this early next week so that we can include it in our upcoming 1.4.7 release.

Comment by Brian Fink [ 25/Feb/21 ]

I think this was closed incorrectly - there has been activity on the PR linked above, and this bug still exists on master.

Comment by Backlog - Core Eng Program Management Team [ 25/Feb/21 ]

There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information.

Comment by Divjot Arora (Inactive) [ 10/Feb/21 ]

There's discussion about how this should be handled on this PR. Moving to "Waiting for Reporter" while we wait for a response there.
