  1. Core Server
  2. SERVER-234

Aggregation is very slow compared to full scans


Details

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • 0.9.7
    • Performance
    • None

    Description

      Filtering by a field that isn't indexed is extremely fast. However, summing that same field with db.group is very slow. I assume this is because the aggregation takes a function and server-side JavaScript eval is slow. Would it be possible to have an aggregation facility with performance similar to a full scan, which is very fast?

      The query I am using here is:
      db.group({
        ns: "testCollection",
        key: {},
        reduce: function(obj, prev) { prev.csum += obj.field1; },
        initial: { csum: 0 }
      })
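
      To make the cost concrete, here is a minimal plain-JavaScript sketch of what a group call with key, reduce, and initial has to do. The helper name groupSketch and the sample documents are hypothetical, and this is not the server's implementation; it only illustrates that the reduce function is invoked once per document, which is where the overhead relative to a plain scan would come from.

```javascript
// Hypothetical stand-in for a server-side group loop; not MongoDB code.
function groupSketch(docs, spec) {
  const buckets = new Map();
  for (const doc of docs) {
    // Build the grouping key from the fields named in spec.key.
    const keyDoc = {};
    for (const field of Object.keys(spec.key)) keyDoc[field] = doc[field];
    const id = JSON.stringify(keyDoc);
    if (!buckets.has(id)) {
      // Each group starts from its own shallow copy of `initial`.
      buckets.set(id, Object.assign({}, keyDoc, spec.initial));
    }
    // One JavaScript function call per visited document.
    spec.reduce(doc, buckets.get(id));
  }
  return Array.from(buckets.values());
}

// Made-up sample documents standing in for testCollection.
const docs = [{ field1: 1 }, { field1: 2 }, { field1: 3 }];
const result = groupSketch(docs, {
  key: {},
  reduce: function (obj, prev) { prev.csum += obj.field1; },
  initial: { csum: 0 }
});
console.log(result); // [ { csum: 6 } ]
```

      With an empty key, every document falls into a single group, so the result is one object holding the running sum.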

      Sum isn't the only interesting aggregation to make fast. The most useful case here would be to have a few buckets and, as we visit each item, place one of its fields in a bucket to collect a histogram of stats. An example of this would be counting Jira bug status codes in one aggregate: { blocking: 5, major: 3 ... }.
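
      The histogram case can be sketched with a reduce function in the same (obj, prev) shape. The field name status and the sample issues are assumptions made for illustration; the loop below runs in plain JavaScript rather than on the server.

```javascript
// Reduce function in the (obj, prev) shape used by the query above.
// `status` is a hypothetical field name.
function countStatus(obj, prev) {
  prev[obj.status] = (prev[obj.status] || 0) + 1;
}

// Made-up sample documents.
const issues = [
  { status: "blocking" },
  { status: "major" },
  { status: "blocking" }
];

// Start from an empty accumulator and visit each item once.
const histogram = {};
issues.forEach(function (issue) { countStatus(issue, histogram); });
console.log(histogram); // { blocking: 2, major: 1 }
```

      A function like this could presumably be passed as reduce with an empty initial object, so a single pass collects every bucket.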

      It just seems to me that if a non-indexed find is fast, then the aggregate case should have similar performance.

          People

            eliot Eliot Horowitz (Inactive)
            carrino John Carrino