Core Server / SERVER-40625

Open File Descriptor Regression

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • Affects Version/s: 3.6.10, 3.6.11
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:
      ~200 databases in each MongoDB cluster

      Each database has ~40 collections, and the workload performs a mix of all the operations below:

      • query
      • update
      • delete
      • insert
      • drop collections
      • create collections
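
      A minimal sketch of a workload with this shape, for anyone trying to reproduce it. This is not the reporter's actual job code: the names `plan_workload` and `apply_op`, the database/collection naming scheme, the document fields, and `ops_per_db` are all hypothetical. `apply_op` assumes a pymongo-style client.

```python
import random

# The six operation types listed in the steps to reproduce.
OPS = ("query", "update", "delete", "insert",
       "drop_collection", "create_collection")


def plan_workload(n_dbs=200, n_colls=40, ops_per_db=100, seed=0):
    """Yield (db_name, collection_name, op) tuples for a mixed workload.

    The names and the uniform random mix are illustrative assumptions;
    the report does not state the real ratio of operations.
    """
    rng = random.Random(seed)
    for d in range(n_dbs):
        for _ in range(ops_per_db):
            coll_name = f"coll_{rng.randrange(n_colls)}"
            yield f"db_{d}", coll_name, rng.choice(OPS)


def apply_op(client, db_name, coll_name, op):
    """Apply one planned operation through a pymongo-style client."""
    db = client[db_name]
    coll = db[coll_name]
    if op == "query":
        coll.find_one({"k": 1})
    elif op == "update":
        coll.update_one({"k": 1}, {"$set": {"v": 2}}, upsert=True)
    elif op == "delete":
        coll.delete_one({"k": 1})
    elif op == "insert":
        coll.insert_one({"k": 1, "v": 1})
    elif op == "drop_collection":
        db.drop_collection(coll_name)
    elif op == "create_collection":
        # pymongo raises CollectionInvalid if the collection already exists.
        if coll_name not in db.list_collection_names():
            db.create_collection(coll_name)
```

      With pymongo installed, the plan could be driven with something like `for step in plan_workload(): apply_op(MongoClient(uri), *step)`, where `uri` points at the test cluster.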

      Description

      Comparing v3.4.17 and v3.6.11

      As our team is planning to upgrade MongoDB from v3.4 to v3.6, we ran a performance test to compare behavior between the two versions. We have 2 MongoDB clusters on v3.4.17 and 2 MongoDB clusters on v3.6.11, running the same workload across all 4 clusters. (Our workload consists only of background jobs, processing the same amount of data and the same operations.)

      Problem

      From our comparisons, we see a significant regression in the number of open file descriptors. There is a clear trend in v3.6.11: file descriptors leak at a rapid rate. (See the comparison below of the number of open file descriptors on the MongoDB primaries.)
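
      One way to put a number on such a trend is to sample the open file descriptor count of the mongod process over time and fit a slope. The sketch below is an assumption about how that could be done, not how the attached graphs were produced: `count_open_fds` is Linux-only (it reads `/proc/<pid>/fd`), and both function names are hypothetical.

```python
import os


def count_open_fds(pid):
    """Return the number of open file descriptors for a process.

    Linux-only: counts the entries in /proc/<pid>/fd. Reading another
    user's process usually requires matching ownership or root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def leak_rate(samples):
    """Least-squares slope of (timestamp_seconds, fd_count) samples.

    Returns file descriptors gained per second; a value that stays
    positive over a long run suggests a leak rather than normal churn.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    cov = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("timestamps must not all be equal")
    return cov / var
```

      Collecting `(time.time(), count_open_fds(mongod_pid))` at a fixed interval on a v3.4.17 primary and a v3.6.11 primary, then comparing the two `leak_rate` values, would give a like-for-like figure for the two versions.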

      Background

      We are paying extra attention to file descriptors because they have been one of the main issues in our MongoDB clusters. In v3.4 we had a slow file descriptor leak, which resulted in a memory leak: cache usage went over 80% after a long period of time, and performance degraded over time. With help from MongoDB onsite support, we found that the file descriptor leak occurred because our workload drops collections and creates new collections often, causing a leak in the session cache. We requested that https://jira.mongodb.org/browse/SERVER-38779 be backported to v3.6.11. However, even with this fix, we still see a significantly higher rate of file descriptor leak in v3.6.11 compared to v3.4.17 on the same workload. We suspect that other bugs are causing file descriptor leaks in MongoDB v3.6.

        Attachments

        1. Screen Shot 2019-04-30 at 4.55.25 PM.png (701 kB)
        2. Screen Shot 2019-04-19 at 4.51.50 PM.png (175 kB)
        3. Screen Shot 2019-04-19 at 4.51.50 PM.png (175 kB)
        4. Screen Shot 2019-04-19 at 4.51.20 PM.png (169 kB)
        5. Screen Shot 2019-04-12 at 11.21.01 AM.png (615 kB)
        6. OpenFD.png (334 kB)
        7. LatencyTest.png (419 kB)
        8. CacheUsage.png (460 kB)
        9. 3.6.png (110 kB)
        10. 3.4.png (103 kB)

        People

        • Votes: 0
        • Watchers: 14