Aggregation is very slow compared to full scans


    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 0.9.7
    • Component/s: Performance

      Filtering by a field that isn't indexed is extremely fast; however, doing a sum over that same field with db.group is very slow. I'm sure this is because aggregation takes a function and server-side JavaScript eval is slow, but would it be possible to have an aggregation facility whose performance is similar to a full scan, which is very fast?

      The query I am using here is:

      db.group({
          ns: "testCollection",
          key: {},
          reduce: function(obj, prev) { prev.csum += obj.field1; },
          initial: { csum: 0 }
      })
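To make the overhead concrete, here is a minimal plain-JavaScript sketch of the reduce semantics of that group call. The sample documents and their values are hypothetical; the point is only that every scanned document pays for a JavaScript function invocation, whereas a non-indexed find's filter runs natively.

```javascript
// Hypothetical sample documents standing in for testCollection.
const docs = [{ field1: 1 }, { field1: 2 }, { field1: 3 }];

// The same reduce function and initial document passed to db.group above.
const reduce = function (obj, prev) { prev.csum += obj.field1; };
const prev = { csum: 0 };

// One JS function call per scanned document -- this per-document eval
// cost is what makes the group slow relative to a native full scan.
docs.forEach(function (doc) { reduce(doc, prev); });

console.log(prev.csum); // 6
```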

      Sum isn't the only interesting aggregation to make fast. The most useful case here would be to have a few buckets and, as we visit each item, place one of its fields into a bucket to collect a histogram of stats. An example of this would be counting Jira bug status codes in one aggregate: { blocking: 5, major: 3 ... }.
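The bucketed histogram could be expressed with the same db.group machinery; below is a plain-JavaScript sketch of that reduce logic (the `priority` field name and the sample values are assumptions, not from the actual data), which is also roughly what the server would evaluate per document.

```javascript
// Hypothetical sample issues; the "priority" field name is an assumption.
const issues = [
  { priority: "blocking" },
  { priority: "major" },
  { priority: "blocking" },
  { priority: "major" },
  { priority: "minor" },
];

// Equivalent of a db.group reduce that increments one bucket per document:
//   reduce: function(obj, prev) { prev[obj.priority] = (prev[obj.priority] || 0) + 1; }
const histogram = issues.reduce(function (prev, obj) {
  prev[obj.priority] = (prev[obj.priority] || 0) + 1;
  return prev;
}, {});

console.log(histogram); // { blocking: 2, major: 2, minor: 1 }
```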

      It just seems to me that if a non-indexed find is fast, then the aggregate case should have similar performance.

        1. mongo_log.txt
          1 kB
          John Carrino

            Assignee:
            Eliot Horowitz (Inactive)
            Reporter:
            John Carrino
            Votes:
            0
            Watchers:
            0

              Created:
              Updated:
              Resolved: