Command takes much longer to fail than the set timeout, when waiting for a connection from the connection pool


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • Fully Compatible
    • ALL
    • 0

      Time-dependent tests are failing because the ASIO layer does not enforce timeouts while a command waits for a connection from the connection pool.

      Two examples from the logs: the first command took about 91 seconds to fail against a 10-second timeout, and the second took about 182 seconds against a 30-second timeout:

      [js_test:remove2] 2016-11-10T19:01:28.944+0000 d20511| 2016-11-10T19:00:47.542+0000 I REPL     [replication-3] Restarting oplog query due to error: ExceededTimeLimit: Remote command timed out while waiting to get a connection from the pool, took 91286ms, timeout was set to 10000ms. Last fetched optime (with hash): { ts: Timestamp 1478804164000|30, t: 3 }[-9208730938153294485]. Restarts remaining: 3
      
      [js_test:remove2] 2016-11-10T19:01:28.951+0000 d20511| 2016-11-10T19:01:19.420+0000 I NETWORK  [shard registry reload] Marking host ip-10-5-194-11:20514 as failed :: caused by :: ExceededTimeLimit: Remote command timed out while waiting to get a connection from the pool, took 182264ms, timeout was set to 30000ms
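
To find other occurrences in test logs, a small script can pull out the elapsed time and the configured timeout from each message and flag the overruns. This is only a sketch: it assumes the message keeps the "took Nms, timeout was set to Mms" wording seen in the two lines above, and the function name `find_timeout_overruns` and the `slack_ms` threshold are hypothetical, not part of any MongoDB tooling.

```python
import re

# Matches the "took Nms, timeout was set to Mms" fragment from the log lines above.
# Assumption: other occurrences of this error use the same wording.
TIMEOUT_RE = re.compile(r"took (\d+)ms, timeout was set to (\d+)ms")


def find_timeout_overruns(lines, slack_ms=1000):
    """Return (took_ms, timeout_ms, ratio) for each line whose elapsed time
    exceeds the configured timeout by more than slack_ms (hypothetical threshold)."""
    overruns = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue  # not a pool-timeout message
        took_ms = int(match.group(1))
        timeout_ms = int(match.group(2))
        if took_ms > timeout_ms + slack_ms:
            overruns.append((took_ms, timeout_ms, took_ms / timeout_ms))
    return overruns


# The two messages quoted in this ticket, trimmed to the relevant part.
SAMPLE = [
    "ExceededTimeLimit: Remote command timed out while waiting to get a "
    "connection from the pool, took 91286ms, timeout was set to 10000ms.",
    "ExceededTimeLimit: Remote command timed out while waiting to get a "
    "connection from the pool, took 182264ms, timeout was set to 30000ms",
]

for took_ms, timeout_ms, ratio in find_timeout_overruns(SAMPLE):
    print(f"took {took_ms} ms, timeout {timeout_ms} ms, {ratio:.1f}x over")
```

For the two messages above, this reports overruns of roughly 9.1x and 6.1x the configured timeout.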
      

            Assignee:
            DO NOT USE - Backlog - Platform Team
            Reporter:
            Dianna Hohensee (Inactive)
            Votes:
            0
            Watchers:
            8
