- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Connections, Load Balancer
The load balancer spec specifies:
When the driver is in load balancing mode and executing any cursor-initiating command, the driver MUST NOT check the connection back into the pool unless the command fails or the server returns a cursor ID of 0 (i.e. all documents are returned in a single batch). Otherwise, the driver MUST continue to use the same connection for all subsequent getMore commands for the cursor. The driver MUST check the connection back into the pool if the server returns a cursor ID of 0 in a getMore response (i.e. the cursor is drained).
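This is not implemented in the Ruby driver. Instead, the driver uses connection pinning: it remembers which connection the cursor was created on, but still checks the connection back into the pool as usual after each operation.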
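The rule quoted from the spec can be sketched as a single predicate. This is a minimal illustration only: `check_in_connection?` and its keyword arguments are hypothetical names, not part of the Ruby driver's API, and the sketch covers only the cases named in the quoted paragraph.

```ruby
# Hypothetical helper, not part of the Ruby driver.
# In load balancing mode, decides whether the connection may be checked
# back into the pool after a cursor-initiating command or a getMore.
def check_in_connection?(command_failed:, cursor_id:)
  # Check in when the command failed, or when the server reports a
  # cursor ID of 0 (single batch, or cursor drained by getMore).
  command_failed || cursor_id.zero?
end

check_in_connection?(command_failed: false, cursor_id: 0)  # => true
check_in_connection?(command_failed: false, cursor_id: 42) # => false
check_in_connection?(command_failed: true, cursor_id: 42)  # => true
```

In the second case the connection would stay checked out and be reused for every getMore on that cursor, so no other thread could pick it up in between.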
Checking the connection in between operations leads to the following issue:
- Thread 1 creates a cursor using Connection 1, and checks it in to the pool
- Thread 2 creates a cursor using Connection 1, and checks it in to the pool
- Thread 1 checks out Connection 1 for getMore
- Thread 2 also needs to run getMore, so it tries to check out Connection 1. Connection 1 is not available, so Thread 2 has to wait.
- If Thread 1 does not return Connection 1 for some time, Thread 2 times out.
In this situation, increasing maxConnecting does not help: each thread is waiting for one particular connection, which may be unavailable regardless of how many other connections the pool holds.
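The sequence above can be modeled without a server. The following is a toy sketch using only the Ruby standard library; `PinnedConnection`, the 0.1 second checkout timeout, and the 0.3 second hold are assumptions made for illustration and do not reflect the driver's actual pool implementation.

```ruby
# Toy model of one specific connection that two cursors are pinned to.
# This is not the driver's connection pool.
class PinnedConnection
  def initialize
    @lock = Mutex.new
    @available = ConditionVariable.new
    @checked_out = false
  end

  # Returns true once the connection is checked out,
  # or false if it did not become available within +timeout+ seconds.
  def check_out(timeout)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @lock.synchronize do
      while @checked_out
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        return false if remaining <= 0
        @available.wait(@lock, remaining)
      end
      @checked_out = true
    end
  end

  def check_in
    @lock.synchronize do
      @checked_out = false
      @available.signal
    end
  end
end

connection = PinnedConnection.new
results = {}
started = Queue.new

# Thread 1 checks out Connection 1 for a slow getMore.
thread1 = Thread.new do
  results[:thread1] = connection.check_out(0.1)
  started << true
  sleep 0.3
  connection.check_in
end

started.pop # wait until Thread 1 holds the connection

# Thread 2 needs the same connection and gives up after 0.1 seconds.
thread2 = Thread.new do
  results[:thread2] = connection.check_out(0.1)
end

[thread1, thread2].each(&:join)
# results[:thread1] is true, results[:thread2] is false (timed out)
```

Adding more connections to this model would not change the outcome, because Thread 2 can only use the one connection its cursor is pinned to.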
The following example reproduces the issue locally when run against a load-balanced setup (see here for how to run a load-balanced setup):
    class User
      include Mongoid::Document
      field :number, type: Integer
    end

    1_000.times do |i|
      User.create!(number: i)
      if i % 100 == 0
        print '.'
      end
    end

    threads = 100.times.map do |i|
      Thread.new do
        User.where(:number.gte => i).each do |doc|
          pp doc
        end
      rescue => e
        Mongoid.logger.fatal e.inspect
      end
    end
    threads.each(&:join)
- is duplicated by: RUBY-3528 Timed out attempting to check out a connection from pool (Closed)