SERVER-24375

Deduping in OR, SORT_MERGE, and IXSCAN (multikey case) uses unbounded memory

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.0.12
    • Component/s: None
    • Query Execution
    • ALL
    • Sprint: QE 2022-10-17, QE 2022-10-31, QE 2022-11-14, QE 2022-11-28, QE 2022-12-12, QE 2022-12-26, QE 2023-01-09

      Each of the stages listed in the title keeps a set of RecordIds, which it uses to identify documents it has already seen so that the same document is not returned to the user twice. However, this requires memory proportional to the number of documents processed, and nothing limits how much memory the set may consume. One way to reproduce this unbounded memory growth is given below.
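
As a rough illustration of the deduping pattern described above, here is a minimal JavaScript sketch. It is not the server's C++ implementation: the function name `dedupUnion` and the `{ recordId, doc }` input shape are assumptions made for this example. It shows only that the set of seen ids grows with the number of distinct documents processed and is never trimmed while the query runs.

```javascript
// Hypothetical model of a deduplicating stage; not the server's actual code.
// Each child is an iterable of { recordId, doc } pairs (an assumed shape).
function* dedupUnion(children, seen) {
  for (const child of children) {
    for (const { recordId, doc } of child) {
      if (seen.has(recordId)) continue; // already returned once, skip it
      seen.add(recordId);               // kept until the query finishes
      yield doc;
    }
  }
}

// Two identical branches over the same three records, like the $or in this ticket.
const branch = [
  { recordId: 1, doc: { x: 0, y: 0 } },
  { recordId: 2, doc: { x: 0, y: 0 } },
  { recordId: 3, doc: { x: 0, y: 0 } },
];
const seen = new Set();
const results = [...dedupUnion([branch, branch], seen)];
console.log(results.length, seen.size); // 3 3
```

Each document comes back once, but the set still holds one entry per distinct record, so its size tracks the number of documents scanned rather than the size of the result.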

      • 40 M documents of the form {x:0, y:0}
      • index on {y:1}
      • query of the following form (full repro script attached)
              q = {$or: [{x: 0, y: 0}, {x: 0, y: 0}]}
              db.c.find(q).hint({y: 1}).sort({z: -1}).limit(30).itcount()
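
The setup steps listed above could be scripted along the following lines. This is a sketch under stated assumptions and is not the contents of the attached repro.sh: the helper name `populate` and the batch size are invented for this example, and `coll` stands for any object with an `insert(arrayOfDocs)` method, such as `db.c` in the mongo shell.

```javascript
// Sketch only: inserts `total` documents of the form {x: 0, y: 0} in batches.
// `populate` and `batchSize` are hypothetical, not taken from repro.sh.
function populate(coll, total, batchSize) {
  var inserted = 0;
  while (inserted < total) {
    var n = Math.min(batchSize, total - inserted);
    var batch = [];
    for (var i = 0; i < n; i++) {
      batch.push({ x: 0, y: 0 });
    }
    coll.insert(batch); // the shell's insert() accepts an array of documents
    inserted += n;
  }
  return inserted;
}

// Assumed usage in the mongo shell:
//   populate(db.c, 40 * 1000 * 1000, 10000);
//   db.c.createIndex({ y: 1 });
```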
      

      The heap profile call tree shows memory used by OrStage::work growing to about 1.5 GB as it scans the collection, then dropping back to 0 when the query completes. The graph in each row shows memory usage for that node and its descendants; the second number in each row is the maximum memory in MB for that node.
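
For a sense of scale, the figures in this ticket imply a per-document cost for the set. The estimate below is back-of-the-envelope only: it assumes decimal gigabytes and that all 40 M RecordIds are held in the set at the peak.

```javascript
// Rough estimate from the ticket's own numbers; both inputs are approximate.
var peakBytes = 1.5e9; // about 1.5 GB attributed to OrStage::work at its peak
var recordIds = 40e6;  // 40 M documents in the collection
var bytesPerRecordId = peakBytes / recordIds;
console.log(bytesPerRecordId); // 37.5
```

That works out to a few tens of bytes per remembered RecordId, so the total scales linearly with collection size.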

        1. repro.sh (1 kB)
        2. memory-growth-calltree.png (170 kB)
        3. heap-profile.png (229 kB)

            Assignee:
            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            5
            Watchers:
            46
