[SERVER-19966] Excessive cursor caching in integration layer under WiredTiger Created: 14/Aug/15  Updated: 07/Dec/16  Resolved: 15/Jul/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.5
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Bruce Lucas (Inactive) Assignee: David Hows
Resolution: Done Votes: 4
Labels: WTmem, WTplaybook
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File wt-cache-spacetime.png    
Issue Links:
Duplicate
is duplicated by SERVER-19882 Excessive memory utilisation - open c... Closed
is duplicated by SERVER-22000 MongoDB version 3.2; WiredTiger is ki... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

With modest numbers of mongod client connections and collections a large number of WT cursors can be cached by the integration layer. Each cached WT cursor can consume significant memory (as much as the largest document it has handled), so this can result in very significant excess memory utilization and OOM. Repro:

1000 connections:

for t in $(seq 1000); do
    js test "load('test.js'); repro($t)" &
done
wait

each connection updates 100 collections, each with 4 indexes:

collections = 100
 
function repro(t) {
    var round = 0;
    while (true) {
        round++;
        if (t==1)
            print("=== Starting round: " + round)
        for (i=0; i<collections; i++) {
            var c = db["col" + i]
            if (round==1) {
                c.ensureIndex({a:1})
                c.ensureIndex({b:1})
                c.ensureIndex({c:1})
            }
            c.insert({a:round, b:round, c:round})
        }
        if (t==1) {
            cursors = db.serverStatus().wiredTiger.session['open cursor count']
            print('cursors', cursors)
        }
    }
}

Result is more than 500k cached WT cursors:

=== Starting round: 1
cursors 112025
=== Starting round: 2
cursors 231014
=== Starting round: 3
cursors 318091
=== Starting round: 4
cursors 375483
=== Starting round: 5
cursors 393430
=== Starting round: 6
cursors 393760
=== Starting round: 7
cursors 393950
=== Starting round: 8
cursors 393950
=== Starting round: 9
cursors 397336
=== Starting round: 10
cursors 408085
=== Starting round: 11
cursors 416445
=== Starting round: 12
cursors 443923
=== Starting round: 13
cursors 471652
=== Starting round: 14
cursors 485585
=== Starting round: 15
cursors 488255
=== Starting round: 16
cursors 495570
=== Starting round: 17
cursors 498425
=== Starting round: 18
cursors 500035



 Comments   
Comment by Alexander Gorrod [ 15/Jul/16 ]

bruce.lucas We've tried various different approaches to reducing the cursor caching done in the workload you suggested, and can't find a change that gives a better performance vs memory usage tradeoff. I intend to close the ticket - reopen if you disagree.

Comment by David Hows [ 07/Jul/16 ]

I've been thinking about this a little, and I'm not sure it makes sense to keep many cursors around for each WT table.

The current logic is something like:

  1. A given session wants a cursor
  2. Check through the cursor list in forward order fot this session and find a cursor that is open on the table we want
  3. If we find one, take it out of the cache and return it for use
  4. If we don't find one, make a new one for the given table
  5. When we are done with it, put the cursor back on the front of the queue
  6. When we put the cursor back we increment a "gen" counter and check the gen value against the oldest cursor in the table
  7. If there is more than a 10K difference between oldest and newest "gen" values we delete the last cursor in the list
  8. Recheck until the difference in gen is < 10K

As I see it, we shouldn't ever need this cache to contain more than say 1 cursor for each table as each session maintains a its own cache of cursors. As I understand it, each operation entering WT from MongoDB should be coming from a unique session at any one given time so each session should only ever expect to have one active cursor on a given collection at a time (and when a session is returned we guarantee that it has 0 cursors out). When situations arise and a single session need more cursors for a given collection, we can spawn them but we shouldn't need to keep them around.

Comment by David Hows [ 07/Jul/16 ]

Ran this on my local box with MongoDB master as at 7 Jul 2016 and the cursor counts seemed to cap out at slightly lower values than Michael reported (50k+), maxing out at 45K Cursors at 200 rounds. Seems to cap out at 50615 cursors, this is after 600 rounds and it has been static at basically this level for 300 rounds.

Comment by Alexander Gorrod [ 06/Jul/16 ]

david.hows Can you pick this ticket up please? First step will be re-running the test to confirm that it's still an issue, then reviewing the cursor caching code to understand how it operates.

From the graph michael.cahill attached above, it can be seen that the obvious solution here leads to a performance/space trade off. We need to explore that and decide what the best solution is.

Comment by Ramon Fernandez Marina [ 08/Mar/16 ]

johnnyshields, this ticket can't be closed yet, it represents a real issue that needs to be addressed.

Comment by Johnny Shields [ 08/Mar/16 ]

Are there any follow-ups on this issue or can it be closed?

Comment by Michael Cahill (Inactive) [ 17/Aug/15 ]

ramon.fernandez, as discussed, the version of that code that you and bruce.lucas@10gen.com arrived at over the weekend looks fine to me. We should fix the comment above before merging that into master.

Comment by Michael Cahill (Inactive) [ 17/Aug/15 ]

Thanks for the ticket and repro bruce.lucas@10gen.com. I experimented with a tweaked version today: I added a 4KB padding field to the documents so they have non-trivial size, but focused on 100 threads instead of 1000 (just for ease of testing).

I ran the workload against 3.0.4, 3.0.5, 3.0.5-rc1 and several variants:

  • "nocursorcache": don't cache cursors at all,
  • "noscratch": don't cache any buffers in sessions (but still cache cursors and their buffers), and
  • "sessionreset": free buffers from sessions and cursors (but keep cursors open).

All of these variants are significantly slower on this workload than stock 3.0.6-rc1 (which has 50,000 cursors cached in this test). Stock 3.0.6-rc1 is slightly faster than 3.0.5 with similar memory use.

See the attached graph. The blue values on the left axis are MB of memory used by mongod that is not accounted for in the WiredTiger cache or log buffers. The red values on the right axis are the percentage time difference for a round of the test with 3.0.5 as the baseline. All values were averaged over the last 20 rounds of runs of 200 rounds.

When considering options for reducing memory use, note that the "nocursorcache" variant has almost zero cursors open compared with 50,000+ in the other variants, and the memory saving is 30-40MB. In other words, cursors cost less than 800 bytes each on average in this test. That isn't too surprising, because the worst case of cursors buffering the largest value they have seen is unusual (cursors normally just point to data without making a copy). Aside: one case where cursors do copy data is prefix-compressed index keys, where the uncompressed keys are cached in the cursor during a scan.

Given all of that, I am not convinced that this is a blocker for 3.0.6. The situation has clearly improved since 3.0.4 and even with this worst case workload, the actual memory consumed by cached WT cursors is only a fraction of the memory used by mongod outside the WiredTiger cache. I'm sure we can improve it further, but I don't think the situation is as bad as suggested by multiplying the maximum number of cursors by the largest possible document size.

Comment by Githook User [ 14/Aug/15 ]

Author:

{u'username': u'ramonfm', u'name': u'Ramon Fernandez', u'email': u'ramon.fernandez@mongodb.com'}

Message: SERVER-19966 evict cursors unused in the last 10K operations

Remove "old" cursors from the WT session cursor cache when they haven't been
used in the last 10K operations. This change also limits the amount of cursors
in this cache to 10K, to prevent excessive resource consumption.

(cherry picked from commit 9b4d20439910450acf1385723c85f86bf41d15f0)
Branch: v3.0
https://github.com/mongodb/mongo/commit/a2fa8bfbc73994b4fd14ddc4d4f26170b958d9a7

Comment by Githook User [ 14/Aug/15 ]

Author:

{u'username': u'ramonfm', u'name': u'Ramon Fernandez', u'email': u'ramon.fernandez@mongodb.com'}

Message: SERVER-19966 evict cursors unused in the last 10K operations

Remove "old" cursors from the WT session cursor cache when they haven't been
used in the last 10K operations. This change also limits the amount of cursors
in this cache to 10K, to prevent excessive resource consumption.
Branch: master
https://github.com/mongodb/mongo/commit/9b4d20439910450acf1385723c85f86bf41d15f0

Generated at Thu Feb 08 03:52:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.