- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Connections
- None
I think two things are happening:
1. We heartbeat and connect/dial with a background context, so those calls take a
long time on a paused cluster, which doesn't refuse connections immediately.
2. The only thing that stops this is <-done, which comes from a disconnect, and it
has to win a 50/50 select twice. So I think it's very possible to get unlucky for a
long time over many repeated requests.
Note: to see how much faster disconnect becomes, uncomment the code in connection.go
that is a single select statement.
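
The select behaviour behind the second point can be seen in isolation. When more than one case is ready, Go's select picks one at random, so a ready <-done is not guaranteed to win. The sketch below is not the code from connection.go; pickRandom, pickDoneFirst and countDone are hypothetical helpers, and checking done first is just one common way to give it priority.

```go
package main

import "fmt"

// pickRandom selects between a disconnect signal and pending work.
// If both are ready, select chooses one of them at random.
func pickRandom(done, work <-chan struct{}) bool {
	select {
	case <-done:
		return true
	case <-work:
		return false
	}
}

// pickDoneFirst does a non-blocking check of done before the main select,
// so a ready done always wins over ready work.
func pickDoneFirst(done, work <-chan struct{}) bool {
	select {
	case <-done:
		return true
	default:
	}
	select {
	case <-done:
		return true
	case <-work:
		return false
	}
}

// countDone runs pick n times with both channels ready (closed channels
// are always ready to receive) and counts how often done was chosen.
func countDone(n int, pick func(done, work <-chan struct{}) bool) int {
	done := make(chan struct{})
	work := make(chan struct{})
	close(done)
	close(work)

	hits := 0
	for i := 0; i < n; i++ {
		if pick(done, work) {
			hits++
		}
	}
	return hits
}

func main() {
	fmt.Println("plain select picked done:", countDone(1000, pickRandom), "of 1000")
	fmt.Println("done-first picked done:  ", countDone(1000, pickDoneFirst), "of 1000")
}
```

The plain select count varies from run to run, while the done-first variant picks done every time, which is why a loop that depends on winning the random choice more than once can keep going for a while after a disconnect.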
Attached repro