  Core Server / SERVER-42662

Memory leak in 3.6.13?

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.6.13
    • Component/s: None
    • Labels: None
    • Operating System: ALL

      Hi, we're currently trying to upgrade from 3.4.16 to 3.6.13, hoping to improve the performance issues we've seen in 3.4 (e.g. SERVER-39355, SERVER-42256, SERVER-42062), but so far, after one week running as a secondary, we've only seen worse cache eviction performance and a memory leak. This ticket is for the memory leak issue.

      This is the comparison between our two secondary servers (we have two replica sets), shown in the attached memory.png:

      yellow is 3.4.16
      green is 3.4.16 until 7/31, then 3.6.13
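
      The resident-memory curves in the graph can be reproduced by periodically sampling serverStatus(); a minimal mongo shell sketch of that kind of sampling (mem.resident is reported in MiB):

      // Print resident memory once a minute; mem.resident is in MiB.
      while (true) {
          print(new Date().toISOString() + " resident MiB: " + db.serverStatus().mem.resident);
          sleep(60 * 1000);
      }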

      The workload didn't change, of course; the server has just been taking more and more memory since 3.6.13.
      Here is the interesting difference between the two:

      3.4.16:

      dimelo:SECONDARY> db.serverStatus().tcmalloc.tcmalloc.formattedString
      ------------------------------------------------
      MALLOC:   108775455752 (103736.4 MiB) Bytes in use by application
      MALLOC: +   5002575872 ( 4770.8 MiB) Bytes in page heap freelist
      MALLOC: +   2222075272 ( 2119.1 MiB) Bytes in central cache freelist
      MALLOC: +       107520 (    0.1 MiB) Bytes in transfer cache freelist
      MALLOC: +    165397104 (  157.7 MiB) Bytes in thread cache freelists
      MALLOC: +    584966400 (  557.9 MiB) Bytes in malloc metadata
      MALLOC:   ------------
      MALLOC: = 116750577920 (111342.0 MiB) Actual memory used (physical + swap)
      MALLOC: +  24346746880 (23218.9 MiB) Bytes released to OS (aka unmapped)
      MALLOC:   ------------
      MALLOC: = 141097324800 (134560.9 MiB) Virtual address space used
      MALLOC:
      MALLOC:        5777478              Spans in use
      MALLOC:           2860              Thread heaps in use
      MALLOC:           4096              Tcmalloc page size
      ------------------------------------------------
      Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
      Bytes released to the OS take up virtual address space but no physical memory.
      

      3.6.13:

      dimelo_shard:SECONDARY> db.serverStatus().tcmalloc.tcmalloc.formattedString
      ------------------------------------------------
      MALLOC:   118992583216 (113480.2 MiB) Bytes in use by application
      MALLOC: +  25706909696 (24516.0 MiB) Bytes in page heap freelist
      MALLOC: +   1208287368 ( 1152.3 MiB) Bytes in central cache freelist
      MALLOC: +        56768 (    0.1 MiB) Bytes in transfer cache freelist
      MALLOC: +    862328712 (  822.4 MiB) Bytes in thread cache freelists
      MALLOC: +    648044800 (  618.0 MiB) Bytes in malloc metadata
      MALLOC:   ------------
      MALLOC: = 147418210560 (140589.0 MiB) Actual memory used (physical + swap)
      MALLOC: +  13225304064 (12612.6 MiB) Bytes released to OS (aka unmapped)
      MALLOC:   ------------
      MALLOC: = 160643514624 (153201.6 MiB) Virtual address space used
      MALLOC:
      MALLOC:        5899545              Spans in use
      MALLOC:            882              Thread heaps in use
      MALLOC:           4096              Tcmalloc page size
      ------------------------------------------------
      Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
      Bytes released to the OS take up virtual address space but no physical memory.
      
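      The same numbers are also exposed as raw counters in serverStatus(), which makes the freelist growth easier to track than parsing formattedString. A minimal mongo shell sketch (these are the counter names serverStatus() reports on 3.4/3.6):

      // Pull the raw tcmalloc counters behind the formatted dumps above.
      var t = db.serverStatus().tcmalloc;
      var MiB = 1024 * 1024;
      print("in use by application: " + (t.generic.current_allocated_bytes / MiB).toFixed(1) + " MiB");
      print("page heap freelist:    " + (t.tcmalloc.pageheap_free_bytes / MiB).toFixed(1) + " MiB");
      print("released to OS:        " + (t.tcmalloc.pageheap_unmapped_bytes / MiB).toFixed(1) + " MiB");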

      The most notable difference is in "Bytes in page heap freelist": why aren't those bytes released to the OS? Could this also be related to SERVER-37795? And could it be related to the poorer cache eviction performance we're seeing in SERVER-42256?
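
      In the meantime, one workaround we're considering (an assumption on our side, not a confirmed fix) is enabling tcmalloc's aggressive decommit at runtime, which tells the allocator to madvise() freelist pages back to the OS at the cost of some allocation performance:

      // Assumption: aggressive decommit shrinks the page heap freelist by
      // returning freed pages to the OS more eagerly.
      db.adminCommand({ setParameter: 1, tcmallocAggressiveMemoryDecommit: 1 })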

      Thank you.

      Attachments:

        1. status-3.6.13.txt (33 kB)
        2. status-3.4.16.txt (27 kB)
        3. memory.png (349 kB)
        4. image-2019-08-07-13-16-20-281.png (37 kB)

            Assignee: Danny Hatcher (Inactive) <daniel.hatcher@mongodb.com>
            Reporter: Adrien Jarthon <bigbourin@gmail.com>
            Votes: 3
            Watchers: 14
