Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 2.21.0
Affects Version/s: None
Component/s: Connections, Load Balancer
Labels: None
Quarter: FY25Q3
Case:
Confidence Status: None
Assigned Teams: Ruby Drivers
Documentation Changes: Not Needed
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Link: None
Goal Name(s): None

The load balancer spec specifies:

When the driver is in load balancing mode and executing any cursor-initiating command, the driver MUST NOT check the connection back into the pool unless the command fails or the server returns a cursor ID of 0 (i.e. all documents are returned in a single batch). Otherwise, the driver MUST continue to use the same connection for all subsequent getMore commands for the cursor. The driver MUST check the connection back into the pool if the server returns a cursor ID of 0 in a getMore response (i.e. the cursor is drained).
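The rule quoted above can be illustrated with a small toy model. This is only a sketch and not the driver's implementation: `ToyPool`, `ToyCursor`, and the reply hashes are hypothetical names invented for this example, and no server is involved. The cursor checks a connection out once, keeps it across getMore calls, and checks it in only when a reply carries cursor ID 0 or a command fails.

```ruby
# Toy model only: ToyPool and ToyCursor are hypothetical, not driver classes.
class ToyPool
  def initialize(size)
    @idle = Array.new(size) { |i| "conn-#{i + 1}" }
  end

  def idle_count
    @idle.size
  end

  def check_out
    @idle.shift || raise('no idle connection')
  end

  def check_in(connection)
    @idle.push(connection)
  end
end

class ToyCursor
  attr_reader :documents

  # replies: simulated server replies, in order.
  # Each reply is a hash of the form { docs: [...], cursor_id: Integer }.
  def initialize(pool, replies)
    @pool = pool
    @replies = replies.dup
    @documents = []
    @connection = pool.check_out   # checked out once, for the cursor-initiating command
    consume(@replies.shift)
  end

  def exhausted?
    @connection.nil?
  end

  # Simulates one getMore on the connection the cursor already holds.
  def get_more
    raise 'cursor is exhausted' if exhausted?
    consume(@replies.shift)
  end

  private

  def consume(reply)
    raise 'command failed' if reply.nil?   # stands in for a failed command
    @documents.concat(reply[:docs])
    release if reply[:cursor_id].zero?     # cursor ID 0: drained, check in
  rescue StandardError
    release                                # failure: check in as well
    raise
  end

  def release
    return if @connection.nil?
    @pool.check_in(@connection)
    @connection = nil
  end
end

pool = ToyPool.new(2)
cursor = ToyCursor.new(pool, [
  { docs: [1, 2], cursor_id: 42 },   # first batch, cursor still open
  { docs: [3], cursor_id: 0 }        # getMore reply, cursor drained
])
idle_while_open = pool.idle_count    # the cursor still holds its connection
cursor.get_more
idle_after_drain = pool.idle_count   # the connection is back in the pool
```

In this model no other thread can be handed the cursor's connection between batches, because it never returns to the pool while the cursor is open.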

This is not implemented in the Ruby driver. Instead, the driver uses connection pinning and checks the connection in as usual after each operation.

This leads to the issue described below:

Thread 1 creates a cursor using Connection 1, and checks it in to the pool
Thread 2 creates a cursor using Connection 1, and checks it in to the pool
Thread 1 checks out Connection 1 for getMore
Thread 2 wants to do getMore, so tries to check out Connection 1. It is not available, though, so Thread 2 has to wait.
If Thread 1 does not return Connection 1 for some time, Thread 2 times out.

In this situation, increasing maxConnecting does not help: the threads are waiting for one particular connection, which may not be available.
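The scenario above can be simulated without a server. The following is a minimal sketch under stated assumptions: `PinnedPool`, `check_out_pinned`, and `CheckoutTimeout` are hypothetical names made up for this illustration and do not correspond to the driver's pool API. It shows a thread timing out while waiting for one specific connection even though other connections in the pool are idle.

```ruby
# Toy model only: PinnedPool is hypothetical and is not the driver's pool.
class PinnedPool
  class CheckoutTimeout < StandardError; end

  def initialize(size)
    @idle = (1..size).to_a               # idle connection IDs
    @lock = Mutex.new
    @returned = ConditionVariable.new
  end

  def idle_count
    @lock.synchronize { @idle.size }
  end

  # Check out one specific connection, waiting at most `timeout` seconds for it.
  def check_out_pinned(id, timeout:)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @lock.synchronize do
      until @idle.include?(id)
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        raise CheckoutTimeout, "connection #{id} not available" if remaining <= 0
        @returned.wait(@lock, remaining)
      end
      @idle.delete(id)
    end
  end

  def check_in(id)
    @lock.synchronize do
      @idle.push(id)
      @returned.broadcast
    end
  end
end

pool = PinnedPool.new(10)

# "Thread 1" has checked out connection 1 for its getMore and is slow to return it.
pool.check_out_pinned(1, timeout: 0.1)

# "Thread 2" owns a cursor pinned to the same connection and has to wait for it.
waiter = Thread.new do
  pool.check_out_pinned(1, timeout: 0.1)
  :checked_out
rescue PinnedPool::CheckoutTimeout
  :timed_out
end
result = waiter.value

idle_connections = pool.idle_count   # the other connections were idle the whole time
```

The waiting thread cannot use any of the idle connections, because its cursor is tied to the one that is currently checked out.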

The following example reproduces the issue locally when run against a load-balanced setup (see here for how to run a load-balanced setup):

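# Assumes Mongoid is already configured to connect to the load-balanced setup.
require 'mongoid'

class User
  include Mongoid::Document

  field :number, type: Integer
end

# Seed 1,000 documents, printing a dot for every 100 created.
1_000.times do |i|
  User.create!(number: i)
  print '.' if (i % 100).zero?
end

# Start 100 threads that each iterate a multi-batch cursor concurrently.
threads = 100.times.map do |i|
  Thread.new do
    User.where(:number.gte => i).each do |doc|
      pp doc
    end
  rescue => e
    # Connection checkout timeouts are logged here.
    Mongoid.logger.fatal e.inspect
  end
end

threads.each(&:join)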

is duplicated by: RUBY-3528 Timed out attempting to check out a connection from pool (Closed)

Assignee: Dmitry Rybakov
Reporter: Dmitry Rybakov
Votes: 0
Watchers: 7

Created: Apr 24 2024 06:08:47 PM UTC
Updated: Aug 21 2024 06:37:26 AM UTC
Resolved: Aug 21 2024 06:37:26 AM UTC
Confidence Status Last Update: 08/Aug/24 12:56 PM
