Node.js Driver / NODE-4283

Performance regression of parsing Array type fields


      After updating the MongoDB Node.js driver from v3.7.3 to v4.6.0 in our Node.js app, I noticed a performance regression in the app's endpoints. After investigating, I found that deserialization of documents with array fields containing multiple elements has become noticeably slower.

      I managed to reproduce the regression by concurrently fetching 10k documents, each containing an array field with 20 string elements. Here are the repositories with a benchmark that reproduces the performance regression:

      https://github.com/baryshok/mongodb-fetch-arrays-driver-v3
      https://github.com/baryshok/mongodb-fetch-arrays-driver-v4
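
The shape of the benchmark described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the linked repositories: `makeDoc`, `runBenchmark`, the `tags` field name, and the `docsPerFetch` parameter are hypothetical names chosen for illustration, and `collection` is assumed to be a driver `Collection` object (v3 or v4).

```javascript
// Hypothetical sketch of the benchmark; not the code from the linked repositories.

// Build one test document with an array field of 20 strings.
// The field name `tags` is an assumption made for illustration.
function makeDoc(i) {
  return {
    _id: i,
    tags: Array.from({ length: 20 }, (_, j) => `value-${i}-${j}`),
  };
}

// Issue `concurrency` identical fetches at once and time the whole batch.
// `collection` is assumed to be a driver Collection (v3 or v4).
async function runBenchmark(collection, concurrency, docsPerFetch = 10000) {
  const start = process.hrtime.bigint();
  const results = await Promise.all(
    Array.from({ length: concurrency }, () =>
      collection.find({}).limit(docsPerFetch).toArray()
    )
  );
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  const totalDocs = results.reduce((sum, docs) => sum + docs.length, 0);
  return { elapsedMs, totalDocs };
}
```

With a connected collection, seeding could look like `await collection.insertMany(Array.from({ length: 10000 }, (_, i) => makeDoc(i)))`, followed by `await runBenchmark(collection, 300)` under each driver version.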

      On driver v3 the benchmark for 300 concurrent requests takes ~30 seconds, whereas on driver v4 it takes ~50 seconds (see the attached screenshots).

      I took measurements for different numbers of concurrent requests and plotted them in a chart to better illustrate the scale of the regression (see the attached images).

      The measurements were taken on Node.js v14.18.3, but I also briefly checked the latest releases of v16 and v18, and the results are consistent with those from v14 (they are even worse in those versions, by the way).

      The most noticeable difference between the CPU profiles for the 300-concurrent-request benchmark is that garbage collection takes ~9 seconds on driver v3, whereas on driver v4 it takes ~25 seconds.

      [CPU profile screenshots: Driver v4, Driver v3]

      MongoDB Server v5.0.7

      I hope it's possible to figure out what's causing the Garbage Collection to take so much more time than before and improve it. Thanks!
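
For comparing garbage collection time between the two driver versions without capturing a full CPU profile, one option is Node's built-in `perf_hooks` module. The following is a minimal sketch, assuming that the summed duration of `gc` performance entries is an acceptable stand-in for the GC time shown in the profiles; `trackGc` is a hypothetical helper name, not part of the driver or of the linked benchmarks.

```javascript
const { PerformanceObserver } = require('perf_hooks');

// Hypothetical helper: sums the duration of GC events observed
// between the call to trackGc() and the call to stop().
function trackGc() {
  let totalMs = 0;
  let count = 0;
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      totalMs += entry.duration; // duration of one GC event, in milliseconds
      count += 1;
    }
  });
  observer.observe({ entryTypes: ['gc'] });
  return {
    // Stop observing and return what has been recorded so far.
    stop() {
      observer.disconnect();
      return { totalMs, count };
    },
  };
}
```

A run would call `const gc = trackGc()` before the workload and `gc.stop()` after awaiting it. GC entries are delivered to the observer asynchronously, so events that are still pending when `stop()` is called may not be counted.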

        1. 4.6-flamgraph-capture.PNG (521 kB)
        2. 5.0-flamgraph-capture.PNG (556 kB)
        3. driver-3.7.3-fetch-10000-docs-300-times-1653993002832.cpuprofile (551 kB)
        4. driver-4.6.0-fetch-10000-docs-300-times-1653993054277.cpuprofile (574 kB)
        5. image-2022-05-31-15-52-28-389.png (29 kB)
        6. image-2022-05-31-15-53-23-777.png (30 kB)
        7. image-2022-05-31-15-54-02-280.png (110 kB)
        8. image-2022-05-31-15-57-38-591.png (446 kB)
        9. image-2022-05-31-15-59-00-904.png (212 kB)
        10. image-2022-05-31-15-59-34-443.png (162 kB)

            Assignee:
            warren.james@mongodb.com Warren James
            Reporter:
            baryshok@astraload.com Denis Baryshok
            Neal Beeken
            Votes:
            5
            Watchers:
            12
