[SERVER-34377] Possible memory leak Created: 08/Apr/18  Updated: 24/Apr/18  Resolved: 24/Apr/18

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tarvi Pillessaar Assignee: Donald Anderson
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

HW: AWS c4.2xlarge
OS: Ubuntu Xenial
Kernel: 4.4.0-92-generic


Attachments: PNG File Screen Shot 2018-04-08 at 20.22.20.png     File diagnostic.data.tar     PNG File increase.png    
Issue Links:
Related
related to WT-4052 Free transaction resources on session... Closed
Operating System: ALL
Sprint: Storage Non-NYC 2018-05-07
Participants:

 Description   

After upgrading one of our MongoDB clusters to 3.6.3, I noticed that mongod appears to be leaking memory over time (it may have been leaking in 3.4 as well).
WiredTiger stats report only about 5 GB of cache used, but according to ps, mongod is using more than 12 GB.

I haven't restarted this host yet; I can collect more information if needed.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mongodb   3021 97.1 84.3 21641472 12996084 ?   Ssl  Feb27 56667:27 /usr/bin/mongod --config /etc/mongod.conf

WiredTiger cache statistics:

db.serverStatus().wiredTiger.cache
{
	"application threads page read from disk to cache count" : 2657042,
	"application threads page read from disk to cache time (usecs)" : 388085328,
	"application threads page write from cache to disk count" : 266761274,
	"application threads page write from cache to disk time (usecs)" : 2414299779,
	"bytes belonging to page images in the cache" : 3403211247,
	"bytes belonging to the lookaside table in the cache" : 182,
	"bytes currently in the cache" : 5435745914,
	"bytes not belonging to page images in the cache" : 2032534667,
	"bytes read into cache" : 47026988836,
	"bytes written from cache" : NumberLong("3776856175328"),
	"checkpoint blocked page eviction" : 1708,
	"eviction calls to get a page" : 32023358,
	"eviction calls to get a page found queue empty" : 2515062,
	"eviction calls to get a page found queue empty after locking" : 509387,
	"eviction currently operating in aggressive mode" : 0,
	"eviction empty score" : 0,
	"eviction passes of a file" : 1883881,
	"eviction server candidate queue empty when topping up" : 85529,
	"eviction server candidate queue not empty when topping up" : 859553,
	"eviction server evicting pages" : 0,
	"eviction server slept, because we did not make progress with eviction" : 31345553,
	"eviction server unable to reach eviction goal" : 0,
	"eviction state" : 32,
	"eviction walk target pages histogram - 0-9" : 129324,
	"eviction walk target pages histogram - 10-31" : 795080,
	"eviction walk target pages histogram - 128 and higher" : 0,
	"eviction walk target pages histogram - 32-63" : 126038,
	"eviction walk target pages histogram - 64-128" : 833439,
	"eviction walks abandoned" : 252649,
	"eviction walks gave up because they restarted their walk twice" : 286047,
	"eviction walks gave up because they saw too many pages and found no candidates" : 17506,
	"eviction walks gave up because they saw too many pages and found too few candidates" : 32906,
	"eviction walks reached end of tree" : 914012,
	"eviction walks started from root of tree" : 580255,
	"eviction walks started from saved location in tree" : 1303626,
	"eviction worker thread active" : 4,
	"eviction worker thread created" : 0,
	"eviction worker thread evicting pages" : 28921953,
	"eviction worker thread removed" : 0,
	"eviction worker thread stable number" : 0,
	"failed eviction of pages that exceeded the in-memory maximum count" : 35487,
	"failed eviction of pages that exceeded the in-memory maximum time (usecs)" : 517310,
	"files with active eviction walks" : 0,
	"files with new eviction walks started" : 627965,
	"force re-tuning of eviction workers once in a while" : 0,
	"hazard pointer blocked page eviction" : 57852,
	"hazard pointer check calls" : 33233880,
	"hazard pointer check entries walked" : 190015256,
	"hazard pointer maximum array length" : 3,
	"in-memory page passed criteria to be split" : 235698,
	"in-memory page splits" : 86165,
	"internal pages evicted" : 8326,
	"internal pages split during eviction" : 61,
	"leaf pages split during eviction" : 112849,
	"lookaside score" : 0,
	"lookaside table entries" : 0,
	"lookaside table insert calls" : 0,
	"lookaside table remove calls" : 0,
	"maximum bytes configured" : 7347372032,
	"maximum page size at eviction" : 0,
	"modified pages evicted" : 28692743,
	"modified pages evicted by application threads" : 0,
	"overflow pages read into cache" : 0,
	"page split during eviction deepened the tree" : 0,
	"page written requiring lookaside records" : 0,
	"pages currently held in the cache" : 176311,
	"pages evicted because they exceeded the in-memory maximum count" : 105360,
	"pages evicted because they exceeded the in-memory maximum time (usecs)" : 965601663,
	"pages evicted because they had chains of deleted items count" : 145901,
	"pages evicted because they had chains of deleted items time (usecs)" : 293750697,
	"pages evicted by application threads" : 0,
	"pages queued for eviction" : 94503526,
	"pages queued for urgent eviction" : 302231,
	"pages queued for urgent eviction during walk" : 16993,
	"pages read into cache" : 2660205,
	"pages read into cache requiring lookaside entries" : 0,
	"pages read into cache skipping older lookaside entries" : 0,
	"pages read into cache with skipped lookaside entries needed later" : 0,
	"pages requested from the cache" : 444638466854,
	"pages seen by eviction walk" : 675910019,
	"pages selected for eviction unable to be evicted" : 103201,
	"pages walked for eviction" : 866578969,
	"pages written from cache" : 295217700,
	"pages written requiring in-memory restoration" : 886952,
	"percentage overhead" : 8,
	"tracked bytes belonging to internal pages in the cache" : 17646012,
	"tracked bytes belonging to leaf pages in the cache" : 5418099902,
	"tracked dirty bytes in the cache" : 29123058,
	"tracked dirty pages in the cache" : 74,
	"unmodified pages evicted" : 4344899
}



 Comments   
Comment by Tarvi Pillessaar [ 24/Apr/18 ]

Thank you. I will test it when it gets into a 3.6 release.

Comment by Donald Anderson [ 24/Apr/18 ]

We believe the cause has been identified as WT-4052, so I'm closing this as a duplicate. tarvip, when this gets into a 3.6 release, could you test it out then to make sure it solves the problem? Thanks!

Comment by Donald Anderson [ 23/Apr/18 ]

From a conversation with keith.bostic about this:
Q: Why does this show up with just truncate? Any modify should have the same effect, right?
A: A typical point-update hits N records; a truncate hits many.

Comment by Donald Anderson [ 23/Apr/18 ]

alexander.gorrod, I just put a conditional print in __txn_next_op, where the realloc happens, to see if txn->mod_alloc ever gets big. With that, I ran test_truncate01.py with scenario 0, which forces only the 'simple' data set to be used (no WT indices); I see the same growth with other scenarios too. For these tests, the highest I see txn->mod_alloc reach is 122880. Here's the stack when that happens:

  • frame #0: 0x000000010407af5c libwiredtiger-3.0.1.dylib`__txn_next_op(session=0x0000000100784598, opp=0x00007fff5fbfc3d8) at txn.i:189
    frame #1: 0x0000000104079c65 libwiredtiger-3.0.1.dylib`__wt_txn_modify(session=0x0000000100784598, upd=0x0000000102256270) at txn.i:235
    frame #2: 0x000000010407954b libwiredtiger-3.0.1.dylib`__wt_col_modify(session=0x0000000100784598, cbt=0x0000000102235500, recno=641, value=0x0000000102235610, upd_arg=0x0000000000000000, modify_type=5, exclusive=false) at col_modify.c:162
    frame #3: 0x000000010402d2eb libwiredtiger-3.0.1.dylib`__cursor_col_modify(session=0x0000000100784598, cbt=0x0000000102235500, modify_type=5) at bt_cursor.c:429
    frame #4: 0x000000010402fe4e libwiredtiger-3.0.1.dylib`__cursor_truncate(session=0x0000000100784598, start=0x0000000102235500, stop=0x00000001022351f0, rmfunc=(libwiredtiger-3.0.1.dylib`__cursor_col_modify at bt_cursor.c:428)) at bt_cursor.c:1601
    frame #5: 0x000000010402fa89 libwiredtiger-3.0.1.dylib`__wt_btcur_range_truncate(start=0x0000000102235500, stop=0x00000001022351f0) at bt_cursor.c:1712
    frame #6: 0x00000001041876a8 libwiredtiger-3.0.1.dylib`__wt_schema_range_truncate(session=0x0000000100784598, start=0x0000000102235500, stop=0x00000001022351f0) at schema_truncate.c:145
    frame #7: 0x0000000104189f6c libwiredtiger-3.0.1.dylib`__wt_session_range_truncate(session=0x0000000100784598, uri=0x0000000000000000, start=0x0000000102235500, stop=0x00000001022351f0) at session_api.c:1276
    frame #8: 0x0000000104193a4b libwiredtiger-3.0.1.dylib`__session_truncate(wt_session=0x0000000100784598, uri=0x0000000000000000, start=0x0000000000000000, stop=0x00000001022351f0, config=0x0000000000000000) at session_api.c:1347

This is running the WT from the MongoDB 3.6.3 drop. txn.i has been modified in this tree, so ignore the line numbers in the first couple frames.

Notable in the stack is this:

frame #4: 0x000000010402fe4e libwiredtiger-3.0.1.dylib`__cursor_truncate(session=0x0000000100784598, start=0x0000000102235500, stop=0x00000001022351f0, rmfunc=(libwiredtiger-3.0.1.dylib`__cursor_col_modify at bt_cursor.c:428)) at bt_cursor.c:1601
1598 F_MASK((WT_CURSOR *)start, WT_CURSTD_KEY_SET) == WT_CURSTD_KEY_INT);
1599
1600 for (;;) {
-> 1601 if ((ret = rmfunc(session, start, WT_UPDATE_TOMBSTONE)) != 0)
1602 break;
1603
1604 if (stop != NULL && __cursor_equals(start, stop))

So the presumption is that a truncate will create a lot of txn ops, growing the txn->mod array, which effectively won't be freed.

122880 * 100 bytes is roughly 12 MB, and that's for a test with 1000 entries. Eventually, lots of sessions can build up allocations like this.
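
To make the mechanism concrete, here is a minimal, self-contained toy model of that growth pattern. This is not WiredTiger code: the toy_txn/toy_next_op names and the chunked growth policy are invented for illustration; only the behavior described above (a realloc-grown per-transaction op array that is reused but never shrunk after commit) and the 122880-slot / ~100-byte figures come from this comment.

/* Toy model of the txn->mod growth described above; not WiredTiger code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char pad[100];        /* roughly 100 bytes per op, per the arithmetic above */
} toy_op;

typedef struct {
    toy_op *mod;                   /* per-transaction operation array */
    size_t mod_alloc;              /* slots allocated (never shrunk here) */
    size_t mod_count;              /* slots in use */
} toy_txn;

/* Reserve the next op slot, growing the array when it is full. */
static toy_op *
toy_next_op(toy_txn *txn)
{
    if (txn->mod_count == txn->mod_alloc) {
        size_t n = txn->mod_alloc + 1024;    /* growth policy is an assumption */
        toy_op *p = realloc(txn->mod, n * sizeof(toy_op));
        if (p == NULL)
            abort();
        txn->mod = p;
        txn->mod_alloc = n;
    }
    return &txn->mod[txn->mod_count++];
}

/* Commit resets the in-use count but keeps the allocation around for reuse. */
static void
toy_commit(toy_txn *txn)
{
    txn->mod_count = 0;
}

int
main(void)
{
    toy_txn txn;
    memset(&txn, 0, sizeof(txn));

    /* A truncate touching ~122880 records queues one op per record. */
    for (size_t i = 0; i < 122880; i++)
        memset(toy_next_op(&txn), 0, sizeof(toy_op));
    toy_commit(&txn);

    /* 122880 slots * 100 bytes is ~12 MB, still held after the commit and
     * kept for the life of the session that ran the truncate. */
    printf("retained after commit: %zu bytes\n", txn.mod_alloc * sizeof(toy_op));

    free(txn.mod);
    return 0;
}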

Comment by Keith Bostic (Inactive) [ 23/Apr/18 ]

donald.anderson, the path you point out is interesting because it might explain the significance of this being a truncate operation: we do slightly different work underneath the update when it's a truncate operation. I don't see anything that's not being freed, but it might be worth a second look.

Comment by Keith Bostic (Inactive) [ 23/Apr/18 ]

alexander.gorrod, I think donald.anderson raises an interesting point: should we be calling __wt_txn_destroy from __wt_session_release_resources, that is, freeing the snapshot and the list of modifications when resetting a session?

The snapshot is relatively large: 8 bytes for each possible session.
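
Building on the toy_txn model sketched in the comment above, a minimal sketch of the idea being raised here: freeing the per-transaction buffers when a session is reset rather than holding them for the session's lifetime. The function name is hypothetical and this is not the actual WT-4052 change.

/* Hypothetical session-reset hook for the toy_txn model above. */
static void
toy_session_release_resources(toy_txn *txn)
{
    /* Free the list of modifications instead of holding it for reuse. */
    free(txn->mod);
    txn->mod = NULL;
    txn->mod_alloc = txn->mod_count = 0;

    /*
     * A real fix would also free the transaction snapshot here (roughly
     * 8 bytes per possible session, per the sentence above).
     */
}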

Comment by Keith Bostic (Inactive) [ 20/Apr/18 ]

Looking at the stack, once we get into __wt_row_modify(), we're in the same update code that everything else uses; that is, an update by a truncate is no different from an update for any other reason. It's hard to imagine that truncate is leaking memory while other updates aren't.

Comment by Keith Bostic (Inactive) [ 20/Apr/18 ]

donald.anderson

The memory allocated as part of WT_PAGE_ALLOC_AND_SWAP won't be freed until the page is discarded. I don't think there's any adjustment when it's freed. It's a relatively large allocation (a slot per key on the page), and it's not adjusted based on WT_UPDATE_MEMSIZE.
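
As a heavily hedged illustration of how an allocation like this can make RSS drift away from the cache statistics, here is a toy sketch. The struct and field names are invented; only the "relatively large, one slot per key on the page, held until the page is discarded" description comes from this comment, and the accounting behavior shown is an assumption.

/* Toy illustration only; not WiredTiger code. */
#include <stdio.h>
#include <stdlib.h>

struct toy_page {
    size_t entries;               /* keys on the page */
    void **modify_slots;          /* one slot per key, allocated lazily */
    size_t tracked_cache_bytes;   /* what the cache accounting sees */
};

int
main(void)
{
    struct toy_page page = { .entries = 100000, .modify_slots = NULL,
        .tracked_cache_bytes = 0 };

    /* Lazily allocate the per-key slot array (standing in for the
     * WT_PAGE_ALLOC_AND_SWAP allocation described above). */
    page.modify_slots = calloc(page.entries, sizeof(void *));
    if (page.modify_slots == NULL)
        abort();

    /*
     * The allocation is real process memory held until the page is
     * discarded, but this toy never adjusts tracked_cache_bytes for it,
     * which is one way RSS can exceed the reported cache size.
     */
    printf("allocated %zu bytes, tracked %zu bytes\n",
        page.entries * sizeof(void *), page.tracked_cache_bytes);

    free(page.modify_slots);
    return 0;
}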

Comment by Tarvi Pillessaar [ 13/Apr/18 ]

Hi,

I have uploaded the logs and diagnostic.data. At the moment WT reports:
"bytes currently in the cache" : 5697136640
and ps:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mongodb  11465  110 59.6 16439420 9185720 ?    Ssl  Apr11 3500:54 /usr/bin/mongod --config /etc/mongod.conf

This means more than 3 GB is allocated outside the WT cache.

Thank you,
Tarvi

Comment by Bruce Lucas (Inactive) [ 10/Apr/18 ]

Hi Tarvi,

It looks like mongod has allocated about 7 GB of data outside the WT cache, which should not happen.
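
(For reference, the figures behind that estimate, taken from the description above: RSS 12996084 KB is about 12.4 GiB, while "bytes currently in the cache" is 5435745914 bytes, about 5.1 GiB, leaving roughly 7.3 GiB allocated outside the WT cache.)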

To help us investigate, can you please restart mongod with --setParameter heapProfilingEnabled=true? Then, after a couple of days of runtime, please upload:

  • the complete mongod log files since the restart. (You can look for heap profiler messages in the log file to verify that it is running correctly.)
  • diagnostic.data covering the entire time since the restart

This will record information that should enable us to track the source of the memory increase. For this purpose it is important that we have complete logs and diagnostic.data covering the time since the restart.

Since the required files may be too large to attach to this ticket, I've generated a secure upload portal for you.

This may be a slow, steady leak, so after uploading the initial couple of days of data, please leave the instance running in that state in case we need a longer time period to get a clear signal.

Thanks,
Bruce

Comment by Tarvi Pillessaar [ 09/Apr/18 ]

I analyzed one week of mongodb logs; this is what I found:

    100  name: "mongoc", version: "1.9.3"
    659  name: "MongoDB Internal Client", version: "3.4.10"
  32482  name: "MongoDB Internal Client", version: "3.6.3"
   3097  name: "NetworkInterfaceASIO-RS", version: "3.6.3"
      5  name: "nodejs", version: "2.2.31"
     36  name: "nodejs", version: "2.2.33"
     16  name: "nodejs", version: "2.2.34"
  76807  name: "nodejs", version: "3.0.2"
 263701  name: "PyMongo", version: "3.4.0"
    158  name: "PyMongo", version: "3.5.1"
     12  name: "PyMongo", version: "3.6.0"
     81  name: "PyMongo", version: "3.6.1"

The first number shows the connection count.

Our main application is using nodejs.

PyMongo 3.4.0 is used by some check scripts and stats collection tools; these basically connect, run a few queries, and then disconnect.

Comment by Daniel Pasette (Inactive) [ 08/Apr/18 ]

Can you say which client driver and version you are using?
