[GODRIVER-1879] topology.connection TLS handshake never times out Created: 09/Feb/21 Updated: 28/Oct/23 Resolved: 08/Mar/21 |
|
| Status: | Closed |
| Project: | Go Driver |
| Component/s: | None |
| Affects Version/s: | 1.3.7, 1.4.6 |
| Fix Version/s: | 1.4.7 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Brian Fink | Assignee: | Divjot Arora (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Documentation Changes: | Not Needed |
| Description |
|
With TLS configured on a topology, the connection.connect() method can hang forever. From what I can tell, timeouts are applied to every operation in that method except tls.Client.Handshake: if the remote server is up but mongod is hung, the Handshake method hangs indefinitely. We discovered this bug after noticing that the driver continues to route traffic to servers that have crashed. If a mongod exits in a way that triggers a core dump (segfault, I/O error, etc.), the core dump can take a couple of minutes to write to disk. During this time, no topology updates are triggered in the driver, heartbeats hang, and server selection still returns the bad server. This is fairly simple to repro:
|
| Comments |
| Comment by Githook User [ 08/Mar/21 ] |
|
Author: {'name': 'Divjot Arora', 'email': 'divjot.arora@10gen.com', 'username': 'divjotarora'}Message: |
| Comment by Githook User [ 08/Mar/21 ] |
|
Author: {'name': 'Divjot Arora', 'email': 'divjot.arora@10gen.com', 'username': 'divjotarora'}Message: |
| Comment by Githook User [ 08/Mar/21 ] |
|
Author: {'name': 'Divjot Arora', 'email': 'divjot.arora@10gen.com', 'username': 'divjotarora'}Message: |
| Comment by Divjot Arora (Inactive) [ 01/Mar/21 ] |
|
bfink@stripe.com Our policy is to backport onto the latest released minor version (currently 1.4.x). We've made exceptions in cases where the minor version was very new, but this will be the seventh patch release for the 1.4.x branch and the 1.3.x branch is no longer tracked by our CI system, so we feel that backporting to it would be too risky at this point. – Divjot |
| Comment by Brian Fink [ 26/Feb/21 ] |
|
Great! Is there any chance this could be backported to 1.3.x? |
| Comment by Divjot Arora (Inactive) [ 26/Feb/21 ] |
|
bfink@stripe.com I put up a new PR for this ticket: https://github.com/mongodb/mongo-go-driver/pull/594. |
| Comment by Brian Fink [ 25/Feb/21 ] |
|
Awesome, thank you! |
| Comment by Divjot Arora (Inactive) [ 25/Feb/21 ] |
|
Hi bfink@stripe.com, This ticket was incorrectly closed. I've moved it back to "Investigating" so it won't be auto-closed again. I plan on putting up a PR for this early next week so that we can include it in our upcoming 1.4.7 release. |
| Comment by Brian Fink [ 25/Feb/21 ] |
|
I think this was closed incorrectly - there has been activity on the PR linked above, and this bug still exists on master. |
| Comment by Backlog - Core Eng Program Management Team [ 25/Feb/21 ] |
|
There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information. |
| Comment by Divjot Arora (Inactive) [ 10/Feb/21 ] |
|
There's discussion about how this should be handled on this PR. Moving to "Waiting for Reporter" while we wait for a response there. |