Disconnect takes a long time on paused Atlas cluster


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 1.1.3
    • Affects Version/s: None
    • Component/s: Connections

      I think two things are happening:
      1. We heartbeat and connect/dial with a background context, which takes a long time
      on a paused cluster because it doesn't refuse connections immediately.
      2. We only stop doing this when <-done, which comes from a disconnect, is picked in a
      select that is a 50/50 chance, and this happens two times, so I think it's very
      possible to get unlucky for a long time over many repeated requests.
      Note: uncomment the code in connection.go that is a single select statement to see
      how much disconnect speeds up.

      Attached repro

              Assignee:
              Eric Daniels (Inactive)
              Reporter:
              Eric Daniels (Inactive)
              Votes:
              1
              Watchers:
              7

                Created:
                Updated:
                Resolved: