Details
Bug
Resolution: Done
Major - P3
None
3.6.13
None
None
ALL
Description
Hi, we're currently trying to upgrade from 3.4.16 to 3.6.13 in the hope of resolving the performance issues we've seen in 3.4 (e.g. SERVER-39355, SERVER-42256, SERVER-42062), but after 1 week running it as a secondary we've only seen worse cache eviction performance and a memory leak. This ticket covers the memory leak issue.
This is the comparison between our 2 secondary servers (we have two replica sets):
yellow is 3.4.16
green is 3.4.16 until 7/31 and then 3.6.13
The workload didn't change, of course; the server has just been taking more and more memory since the switch to 3.6.13.
Here is the interesting difference between the two:
3.4.16:
dimelo:SECONDARY> db.serverStatus().tcmalloc.tcmalloc.formattedString
------------------------------------------------
MALLOC:   108775455752 (103736.4 MiB) Bytes in use by application
MALLOC: +   5002575872 ( 4770.8 MiB) Bytes in page heap freelist
MALLOC: +   2222075272 ( 2119.1 MiB) Bytes in central cache freelist
MALLOC: +       107520 (    0.1 MiB) Bytes in transfer cache freelist
MALLOC: +    165397104 (  157.7 MiB) Bytes in thread cache freelists
MALLOC: +    584966400 (  557.9 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: = 116750577920 (111342.0 MiB) Actual memory used (physical + swap)
MALLOC: +  24346746880 (23218.9 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: = 141097324800 (134560.9 MiB) Virtual address space used
MALLOC:
MALLOC:        5777478              Spans in use
MALLOC:           2860              Thread heaps in use
MALLOC:           4096              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
3.6.13:
dimelo_shard:SECONDARY> db.serverStatus().tcmalloc.tcmalloc.formattedString
------------------------------------------------
MALLOC:   118992583216 (113480.2 MiB) Bytes in use by application
MALLOC: +  25706909696 (24516.0 MiB) Bytes in page heap freelist
MALLOC: +   1208287368 ( 1152.3 MiB) Bytes in central cache freelist
MALLOC: +        56768 (    0.1 MiB) Bytes in transfer cache freelist
MALLOC: +    862328712 (  822.4 MiB) Bytes in thread cache freelists
MALLOC: +    648044800 (  618.0 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: = 147418210560 (140589.0 MiB) Actual memory used (physical + swap)
MALLOC: +  13225304064 (12612.6 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: = 160643514624 (153201.6 MiB) Virtual address space used
MALLOC:
MALLOC:        5899545              Spans in use
MALLOC:            882              Thread heaps in use
MALLOC:           4096              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
The most notable difference is in "Bytes in page heap freelist" (24516.0 MiB on 3.6.13 versus 4770.8 MiB on 3.4.16). Why isn't that memory released to the OS? Could this be related to SERVER-37795? Could it also be related to the poorer cache eviction performance we're seeing in SERVER-42256?
Thank you.
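For anyone who wants to reproduce the comparison, here is a minimal sketch in plain JavaScript that parses the two outputs quoted above and computes the page heap freelist gap. The helper names `parseTcmalloc` and `sumComponents` are hypothetical (they are not part of MongoDB or tcmalloc), and the input lines are copied from this ticket rather than read from a live server.

```javascript
// Hypothetical helper (not a MongoDB API): parse tcmalloc's formattedString
// into an object mapping each label to its byte count.
function parseTcmalloc(text) {
  const stats = {};
  for (const line of text.split("\n")) {
    // Matches lines such as
    // "MALLOC: + 5002575872 ( 4770.8 MiB) Bytes in page heap freelist".
    // Separator lines and the span/thread counters have no MiB field and are skipped.
    const m = line.match(/^MALLOC:\s*[+=]?\s*(\d+)\s*\(\s*[\d.]+ MiB\)\s*(.+)$/);
    if (m) stats[m[2].trim()] = Number(m[1]);
  }
  return stats;
}

// Values copied from the 3.4.16 output in this ticket.
const v3416 = parseTcmalloc([
  "MALLOC: 108775455752 (103736.4 MiB) Bytes in use by application",
  "MALLOC: + 5002575872 ( 4770.8 MiB) Bytes in page heap freelist",
  "MALLOC: + 2222075272 ( 2119.1 MiB) Bytes in central cache freelist",
  "MALLOC: + 107520 ( 0.1 MiB) Bytes in transfer cache freelist",
  "MALLOC: + 165397104 ( 157.7 MiB) Bytes in thread cache freelists",
  "MALLOC: + 584966400 ( 557.9 MiB) Bytes in malloc metadata",
  "MALLOC: ------------",
  "MALLOC: = 116750577920 (111342.0 MiB) Actual memory used (physical + swap)",
].join("\n"));

// Values copied from the 3.6.13 output in this ticket.
const v3613 = parseTcmalloc([
  "MALLOC: 118992583216 (113480.2 MiB) Bytes in use by application",
  "MALLOC: + 25706909696 (24516.0 MiB) Bytes in page heap freelist",
  "MALLOC: + 1208287368 ( 1152.3 MiB) Bytes in central cache freelist",
  "MALLOC: + 56768 ( 0.1 MiB) Bytes in transfer cache freelist",
  "MALLOC: + 862328712 ( 822.4 MiB) Bytes in thread cache freelists",
  "MALLOC: + 648044800 ( 618.0 MiB) Bytes in malloc metadata",
  "MALLOC: ------------",
  "MALLOC: = 147418210560 (140589.0 MiB) Actual memory used (physical + swap)",
].join("\n"));

// The six components that tcmalloc adds up to "Actual memory used".
const COMPONENTS = [
  "Bytes in use by application",
  "Bytes in page heap freelist",
  "Bytes in central cache freelist",
  "Bytes in transfer cache freelist",
  "Bytes in thread cache freelists",
  "Bytes in malloc metadata",
];

function sumComponents(stats) {
  return COMPONENTS.reduce((total, label) => total + stats[label], 0);
}

// Extra memory sitting in the page heap freelist on 3.6.13, in MiB.
const LABEL = "Bytes in page heap freelist";
const deltaMiB = (v3613[LABEL] - v3416[LABEL]) / (1024 * 1024);
// deltaMiB.toFixed(1) is "19745.2", i.e. roughly 19.3 GiB more than on 3.4.16.
```

On a live node the same `parseTcmalloc` call could be given `db.serverStatus().tcmalloc.tcmalloc.formattedString` directly. The `sumComponents` check is only there to confirm the parse picked up every line: for both servers it equals the reported "Actual memory used (physical + swap)".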