  1. Ruby Driver
  2. RUBY-347

mongodb-ruby-driver causes a variable/higher number of connections with performance impact

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 1.4.1
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels:
      None
    • Environment:
      MongoHQ, semi-dedicated environment, 2 replicas and an arbiter (orchid)
      Ruby 1.9.2, mongo/bson/bson_ext 1.4.1, Mongoid 2.0.2
    • Backwards Compatibility:
      Major Change

      Description

      The short story is that upgrading to 1.4.0 and then 1.4.1 made our production environment (almost) toast.

      • new-relic.png: shows query performance right after deploying 1.4.0 and then 1.4.1; the problem only went away after downgrading to 1.3.1
      • mongohq-conncount.png: shows the number of connections from the Rails app to MongoDB varying significantly, up to 77; downgrading to 1.3.1 brought it back to a stable 11
      • mongostat.png: shows nothing unusual while queries time out from Ruby

      A random sampler, which executes db.runCommand({ ping: 1 }) once per second:

      30.times { puts Benchmark.realtime { Mongoid.master.connection.active? }; sleep(1) }

      0.024342775344848633
      0.08080220222473145
      2.113878011703491 <-------- not ok
      0.023059368133544922
      0.03187060356140137
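
The sampling above can be made easier to read by flagging outliers automatically. The following is a minimal sketch, not part of the original report: `sample_latency` and its `threshold` parameter are hypothetical names, and the helper times whatever block it is given rather than calling the driver itself.

```ruby
require 'benchmark'

# Hypothetical helper (not from the driver or Mongoid): times the given block
# `count` times, pausing `interval` seconds between runs, and collects the
# samples that took longer than `threshold` seconds.
def sample_latency(count: 30, interval: 1, threshold: 1.0)
  timings = Array.new(count) do
    elapsed = Benchmark.realtime { yield }
    sleep(interval)
    elapsed
  end
  slow = timings.select { |t| t > threshold }
  { timings: timings, slow: slow, max: timings.max }
end

# In the setup described in this report, the block would be the ping check:
#   result = sample_latency { Mongoid.master.connection.active? }
#   puts "#{result[:slow].size} slow pings, worst #{result[:max]}s"
```

Keeping the timed call in a block means the same sampler can be pointed at any operation suspected of stalling, and the one-second threshold is only an assumed default that should be tuned to the normal latency of the deployment.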

      • At the same time we're experiencing timeouts between replicas, but with 1.3.1 this doesn't affect performance. MongoDB log:

      Fri Oct 21 20:50:01 [ReplSetHealthPollTask] EINTR retry
      Fri Oct 21 20:50:01 [ReplSetHealthPollTask] DBClientCursor::init call() failed
      Fri Oct 21 20:50:01 [ReplSetHealthPollTask] replSet info arbiter0.orchid.mongohq.com:10001 is down (or slow to respond): DBClientBase::findOne: transport error: arbiter0.orchid.mongohq.com:10001 query: { replSetHeartbeat: "orchid_1", v: 3, pv: 1, checkEmpty: false, from: "node0.orchid.mongohq.com:10001" }

      Fri Oct 21 20:50:05 [ReplSetHealthPollTask] replSet info arbiter0.orchid.mongohq.com:10001 is up

        Attachments

        1. mongohq-conncount.png (31 kB)
        2. mongostat.png (89 kB)
        3. new-relic.png (23 kB)
