Core Server / SERVER-22910

mongos keeps bad connections around to downed hosts

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Networking, Sharding
    • Labels:
      None
    • Operating System:
      ALL

      Description

      Because mongos splits its connections between the legacy connection pool and the various NetworkInterfaceASIO connection pools, a network error detected in one pool is not propagated to the others.

      Consider the following scenario:

      In a sharded cluster, one shard is restarted. A client runs a find command against mongos, which fails because a bad connection to the restarted shard is used. At that point mongos correctly drops all the connections it holds to that shard in the pool that served the find. The client retries the find and it succeeds.

      However, if the client then runs a count command, that command fails, because bad connections to the restarted shard are still present in the legacy connection pool.

      The fix here is to drop all pooled connections to a bad host in ALL pools when a network error is detected.
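
      The proposed fix can be illustrated with a minimal sketch. This is not the mongos implementation (which is C++); the names ConnectionPool, PoolRegistry, drop_host and on_network_error are hypothetical, chosen only to show the idea of fanning a single network error out to every pool that may hold connections to the affected host.

```python
from collections import defaultdict


class ConnectionPool:
    """Toy stand-in for one of the mongos connection pools (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._idle = defaultdict(list)  # host -> pooled connection ids

    def add(self, host, conn_id):
        self._idle[host].append(conn_id)

    def drop_host(self, host):
        """Discard every pooled connection to `host`; return how many were dropped."""
        return len(self._idle.pop(host, []))

    def idle_count(self, host):
        return len(self._idle.get(host, []))


class PoolRegistry:
    """Hypothetical registry that knows about ALL pools, legacy and ASIO alike."""

    def __init__(self):
        self._pools = []

    def register(self, pool):
        self._pools.append(pool)
        return pool

    def on_network_error(self, host):
        # The key point of the proposed fix: the error is not handled only
        # by the pool that observed it, but broadcast to every pool.
        return {pool.name: pool.drop_host(host) for pool in self._pools}


registry = PoolRegistry()
legacy = registry.register(ConnectionPool("legacy"))
asio = registry.register(ConnectionPool("asio"))

# Both pools hold connections to the shard that is about to be restarted.
legacy.add("shard0:27018", 1)
legacy.add("shard0:27018", 2)
asio.add("shard0:27018", 3)
legacy.add("shard1:27018", 4)  # unrelated host, must survive

# A network error seen by any pool clears the host from all of them.
dropped = registry.on_network_error("shard0:27018")
```

      With this shape, the error raised by the failed find would also clear the legacy pool, so the subsequent count would open a fresh connection instead of reusing a bad one. Connections to other hosts are left untouched.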

      I have also attached a jstest that reproduces the problem.

            People

            Assignee:
            backlog-server-servicearch Backlog - Service Architecture
            Reporter:
            adam.midvidy Adam Midvidy