Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 2.24.1
Affects Version/s: None
Component/s: Connections, Performance
Labels:
- FY25Q4

Confidence Status:
None

Documentation Changes:
Not Needed
Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
2. Would you like the user to see examples of the syntax and/or executable code and its output?
3. Which versions of the driver/connector does this apply to?


Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Link:
None
Goal Name(s):
None

Problem Description

Running more threads than there are connections in the Mongo connection pool currently causes far more Mongo::Error::ConnectionCheckOutTimeout errors than expected. I haven't investigated in detail, but I suspect the connection pool doesn't serve queries in a first-in-first-out (FIFO) / round-robin fashion across threads. Some threads execute queries repeatedly while others never execute at all, which causes Mongo::Error::ConnectionCheckOutTimeout.

Why This Matters

This issue makes it nearly impossible to use multi-threaded processing with more threads than pool connections in a production app. For example, I use a multi-threaded job for mass-sending emails. It will also cause issues with the recent async query mode added to Mongoid.

Example

Given the following:

10 second wait timeout (default)
Connection pool with 5 connections
200 threads doing queries in a loop
Average of 10ms per database query

In this case, I expect each thread to wait an average of 200 / 5 * 10ms = 400ms between query executions. I would only expect to see 10-second pool timeouts if queries averaged 250ms (10s / (200 / 5)), which is a very long time for a query.
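The arithmetic above can be checked with a few lines of Ruby. This is only a back-of-the-envelope sketch that assumes perfectly fair scheduling and a constant query time; the variable names are illustrative.

```ruby
# Figures taken from the example above.
threads         = 200
pool_size       = 5
query_ms        = 10
wait_timeout_ms = 10_000

# With fair (FIFO) scheduling, each connection is shared by this many threads.
threads_per_connection = threads / pool_size

# Time for one full rotation through the threads sharing a connection.
expected_wait_ms = threads_per_connection * query_ms

# Average query time at which one full rotation would reach the wait timeout.
timeout_threshold_ms = wait_timeout_ms / threads_per_connection

puts "expected wait between queries: #{expected_wait_ms}ms"
puts "query time needed to hit the timeout: #{timeout_threshold_ms}ms"
```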

However, this is not what we observe. Instead, it appears that some threads are never served by the pool and time out, while other threads "cut" in the queue.

Here's a Mongoid script which reproduces the issue:

# set max_pool_size to 5 in mongoid.yml
threads = 200.times.map do |i|
  Thread.new do
    j = 0
    loop do
      Mongo::QueryCache.uncached do
        Person.all.first
      rescue StandardError => e
        puts "\nThread #{i} > #{e}"
      end
      print '.' if j % 30 == 0
      j += 1
    end
  end
end
threads.each(&:join)

For clarity, here is a timing diagram with 5 connections and 10 threads that illustrates how the queuing should work:

O = actively using connection
| = query finished and connection returned to pool
. = waiting in queue

Thread: Time ->
0: OO|...OO|...OO
1: OO|...OO|...OO
2: OO|...OO|...OO
3: OOO|...OO|...O
4: OOO|....OO|...
5: ...OO|...OO|..
6: ...OO|...OO|..
7: ...OO|...OO|..
8: ....OO|...OO|.
9: ....OOO|...OO|
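One way the queuing in the diagram above could be enforced is with an explicit waiter queue, so that a connection is always handed to the thread that has waited longest. The following is a minimal sketch, not the driver's implementation: `FairPool`, `check_out`, and `check_in` are hypothetical names, and plain symbols stand in for real connections.

```ruby
require 'timeout'

# Hypothetical FIFO pool: illustrative only, not part of the Mongo Ruby driver.
class FairPool
  def initialize(resources)
    @available = resources.dup
    @waiters = []          # one ConditionVariable per waiting thread, oldest first
    @lock = Mutex.new
  end

  # Blocks until this thread is the oldest waiter and a resource is free.
  def check_out(timeout)
    deadline = now + timeout
    @lock.synchronize do
      me = ConditionVariable.new
      @waiters << me
      begin
        until @waiters.first.equal?(me) && !@available.empty?
          remaining = deadline - now
          raise Timeout::Error, 'check-out timed out' if remaining <= 0
          me.wait(@lock, remaining)
        end
        @available.pop
      ensure
        # Leave the queue (on success or timeout) and let the next waiter re-check.
        @waiters.delete(me)
        @waiters.first&.signal
      end
    end
  end

  def check_in(resource)
    @lock.synchronize do
      @available.push(resource)
      @waiters.first&.signal   # only the oldest waiter is woken
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Usage: 6 threads share 2 stand-in "connections"; every thread gets served.
pool = FairPool.new(%i[conn_a conn_b])
counts = Array.new(6, 0)
workers = 6.times.map do |i|
  Thread.new do
    10.times do
      conn = pool.check_out(5)
      begin
        sleep 0.001            # stand-in for a database query
        counts[i] += 1
      ensure
        pool.check_in(conn)
      end
    end
  end
end
workers.each(&:join)
```

Because a newly arriving thread always enqueues behind existing waiters, a thread that has just returned a connection cannot take it back ahead of threads already in the queue.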

And here is what I think may be happening:

Thread: Time ->
0: OO|OO|OO|OO|OO
1: OO|.OO|..OO|OO
2: OO|OO|OO|....O
3: OOO|...OO|..OO
4: OOO|OO|OO|OO|.
5: ...OO|...OO|.O
6: ......OO|.....
7: ..........OO|.
8: ..............
9: ..............
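The difference between the two diagrams can be reproduced with a deterministic toy simulation. This is only an illustration of the suspected behavior under simplifying assumptions (every query takes exactly one tick, and under the hypothetical `:barging` policy a thread that just finished always gets its connection back first); it does not model the driver's actual pool.

```ruby
# Toy model: returns how many queries each thread completed.
# `simulate` and its policies are illustrative names, not driver APIs.
def simulate(threads:, pool_size:, ticks:, policy:)
  served = Array.new(threads, 0)
  queue = (0...threads).to_a             # waiting order, oldest first

  ticks.times do
    running = queue.shift(pool_size)     # threads holding a connection this tick
    running.each { |t| served[t] += 1 }

    if policy == :fifo
      queue.concat(running)              # finished threads go to the back
    else
      queue.unshift(*running)            # finished threads cut back in at the front
    end
  end

  served
end

fifo    = simulate(threads: 10, pool_size: 5, ticks: 20, policy: :fifo)
barging = simulate(threads: 10, pool_size: 5, ticks: 20, policy: :barging)

puts "fifo:    #{fifo.inspect}"
puts "barging: #{barging.inspect}"
```

Under the FIFO policy every thread completes the same number of queries, while under the barging policy threads 5 through 9 are never served, which is the starvation pattern that would surface as check-out timeouts.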


errors.png
24 kB
Jan 12 2024 12:34:02 PM UTC
image-2024-01-12-23-00-04-670.png
47 kB
Jan 12 2024 02:00:05 PM UTC
image-2024-01-12-23-00-23-624.png
54 kB
Jan 12 2024 02:00:24 PM UTC
mongo-thread-result-data.xlsx
485 kB
Jan 12 2024 02:08:31 PM UTC
success.png
30 kB
Jan 12 2024 12:33:53 PM UTC

Assignee:: Dmitry Rybakov
Reporter:: Johnny Shields
Votes:: 0
Watchers:: 3

Created:: Dec 23 2023 11:57:12 PM UTC
Updated:: Apr 28 2026 06:50:32 AM UTC
Resolved:: Apr 28 2026 06:50:33 AM UTC
