[SERVER-19966] Excessive cursor caching in integration layer under WiredTiger Created: 14/Aug/15 Updated: 07/Dec/16 Resolved: 15/Jul/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.0.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | David Hows |
| Resolution: | Done | Votes: | 4 |
| Labels: | WTmem, WTplaybook |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: |
| Description |
|
With modest numbers of mongod client connections and collections, the integration layer can cache a large number of WT cursors. Each cached WT cursor can consume significant memory (as much as the largest document it has handled), so this can lead to substantial excess memory use and OOM. Repro: 1000 connections:
each connection updates 100 collections, each with 4 indexes:
Result is more than 500k cached WT cursors:
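The 500k figure is consistent with simple arithmetic on the repro parameters. The sketch below assumes one cached cursor per table per connection, with each collection and each index stored as its own WiredTiger table; that per-connection assumption is illustrative and not stated in the description.

```python
# Back-of-the-envelope count of cached WT cursors for the repro above.
# Assumption: one cached cursor per table per connection, where every
# collection and every index is a separate WiredTiger table.
connections = 1000
collections_per_connection = 100
indexes_per_collection = 4

tables_per_collection = 1 + indexes_per_collection  # the collection plus its indexes
cached_cursors = connections * collections_per_connection * tables_per_collection

print(cached_cursors)  # 500000
```

Any additional cursors held per table would push the total above this baseline, which matches the "more than 500k" observation.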
|
| Comments |
| Comment by Alexander Gorrod [ 15/Jul/16 ] |
|
bruce.lucas We've tried various different approaches to reducing the cursor caching done in the workload you suggested, and can't find a change that gives a better performance vs memory usage tradeoff. I intend to close the ticket - reopen if you disagree. |
| Comment by David Hows [ 07/Jul/16 ] |
|
I've been thinking about this a little, and I'm not sure it makes sense to keep many cursors around for each WT table. The current logic is something like:
As I see it, we shouldn't ever need this cache to contain more than, say, 1 cursor for each table, as each session maintains its own cache of cursors. As I understand it, each operation entering WT from MongoDB should be coming from a unique session at any given time, so each session should only ever expect to have one active cursor on a given collection at a time (and when a session is returned we guarantee that it has 0 cursors out). When a single session does need more cursors for a given collection, we can spawn them, but we shouldn't need to keep them around. |
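A minimal sketch of the policy proposed in the comment above, written in Python for illustration only. The `Session` and `Cursor` classes, the `max_cached` parameter, and the method names are hypothetical; this is not the MongoDB integration layer or the WiredTiger API.

```python
from collections import defaultdict


class Cursor:
    """Stand-in for a WT cursor; only records which table it is open on."""

    def __init__(self, table):
        self.table = table
        self.closed = False

    def close(self):
        self.closed = True


class Session:
    """Hypothetical session caching at most `max_cached` idle cursors per table."""

    def __init__(self, max_cached=1):
        self.max_cached = max_cached
        self._idle = defaultdict(list)  # table name -> idle cached cursors
        self.opened = 0  # number of cursors ever opened, for illustration

    def get_cursor(self, table):
        # Reuse an idle cursor if one is cached, otherwise open a new one.
        if self._idle[table]:
            return self._idle[table].pop()
        self.opened += 1
        return Cursor(table)

    def release_cursor(self, cursor):
        # Keep the cursor only while the per-table cache is below its cap;
        # extra cursors are closed instead of being kept around.
        idle = self._idle[cursor.table]
        if len(idle) < self.max_cached:
            idle.append(cursor)
        else:
            cursor.close()

    def cached_count(self):
        return sum(len(cursors) for cursors in self._idle.values())


session = Session(max_cached=1)
a = session.get_cursor("collection-1")
b = session.get_cursor("collection-1")  # second concurrent cursor on the same table
session.release_cursor(a)
session.release_cursor(b)  # over the cap, so it is closed rather than cached
print(session.cached_count(), b.closed)  # 1 True
```

With a cap of 1, the number of cached cursors is bounded by sessions times tables, regardless of how many cursors a session briefly had open at once. The cost is that a workload needing several concurrent cursors per table reopens them each time.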
| Comment by David Hows [ 07/Jul/16 ] |
|
Ran this on my local box with MongoDB master as of 7 Jul 2016. The cursor counts capped out at slightly lower values than Michael reported (50k+), reaching 45K cursors at 200 rounds. Over a longer run the count seems to cap out at 50615 cursors: this is after 600 rounds, and it has been essentially static at this level for 300 rounds. |
| Comment by Alexander Gorrod [ 06/Jul/16 ] |
|
david.hows Can you pick this ticket up please? The first step will be re-running the test to confirm that it's still an issue, then reviewing the cursor caching code to understand how it operates. The graph michael.cahill attached above shows that the obvious solution here leads to a performance/space trade-off. We need to explore that and decide what the best solution is. |
| Comment by Ramon Fernandez Marina [ 08/Mar/16 ] |
|
johnnyshields, this ticket can't be closed yet, it represents a real issue that needs to be addressed. |
| Comment by Johnny Shields [ 08/Mar/16 ] |
|
Are there any follow-ups on this issue or can it be closed? |
| Comment by Michael Cahill (Inactive) [ 17/Aug/15 ] |
|
ramon.fernandez, as discussed, the version of that code that you and bruce.lucas@10gen.com arrived at over the weekend looks fine to me. We should fix the comment above before merging that into master. |
| Comment by Michael Cahill (Inactive) [ 17/Aug/15 ] |
|
Thanks for the ticket and repro bruce.lucas@10gen.com. I experimented with a tweaked version today: I added a 4KB padding field to the documents so they have non-trivial size, but focused on 100 threads instead of 1000 (just for ease of testing). I ran the workload against 3.0.4, 3.0.5, 3.0.6-rc1 and several variants:
All of these variants are significantly slower on this workload than stock 3.0.6-rc1 (which has 50,000 cursors cached in this test). Stock 3.0.6-rc1 is slightly faster than 3.0.5 with similar memory use.
See the attached graph. The blue values on the left axis are MB of memory used by mongod that is not accounted for in the WiredTiger cache or log buffers. The red values on the right axis are the percentage time difference for a round of the test with 3.0.5 as the baseline. All values were averaged over the last 20 rounds of runs of 200 rounds.
When considering options for reducing memory use, note that the "nocursorcache" variant has almost zero cursors open compared with 50,000+ in the other variants, and the memory saving is 30-40MB. In other words, cursors cost less than 800 bytes each on average in this test. That isn't too surprising, because the worst case of cursors buffering the largest value they have seen is unusual (cursors normally just point to data without making a copy). Aside: one case where cursors do copy data is prefix-compressed index keys, where the uncompressed keys are cached in the cursor during a scan.
Given all of that, I am not convinced that this is a blocker for 3.0.6. The situation has clearly improved since 3.0.4 and even with this worst case workload, the actual memory consumed by cached WT cursors is only a fraction of the memory used by mongod outside the WiredTiger cache. I'm sure we can improve it further, but I don't think the situation is as bad as suggested by multiplying the maximum number of cursors by the largest possible document size. |
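The per-cursor estimate in the comment above can be reproduced with a short calculation. This sketch assumes 1 MB = 10**6 bytes and spreads the saving over exactly 50,000 cursors; since the comment says "50,000+", the results are upper bounds on the average.

```python
# Average memory per cached cursor implied by the figures above.
# Assumptions: 1 MB = 10**6 bytes, and the saving is spread over exactly
# 50,000 cursors (the comment says "50,000+", so these are upper bounds).
cursors = 50_000
saving_mb_low, saving_mb_high = 30, 40

bytes_low = saving_mb_low * 10**6 / cursors
bytes_high = saving_mb_high * 10**6 / cursors

print(bytes_low, bytes_high)  # 600.0 800.0
```

Roughly 600 to 800 bytes per cursor is far below the 4KB document size used in the test, which supports the point that cursors were not each holding a copy of the largest value they had seen.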
| Comment by Githook User [ 14/Aug/15 ] |
|
Author: Ramon Fernandez (ramonfm, ramon.fernandez@mongodb.com)
Message: Remove "old" cursors from the WT session cursor cache when they haven't been
(cherry picked from commit 9b4d20439910450acf1385723c85f86bf41d15f0) |
| Comment by Githook User [ 14/Aug/15 ] |
|
Author: Ramon Fernandez (ramonfm, ramon.fernandez@mongodb.com)
Message: Remove "old" cursors from the WT session cursor cache when they haven't been