Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-15291

slow '$group' performance

    • Type: Icon: Improvement Improvement
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.6.1
    • Component/s: Performance
    • Labels:
      None

      Seems that some $group queries that can be made using index-only scans, are performing unnecessary full table scans.

      Here are more details, As I originally described in http://stackoverflow.com/questions/24980525/mongodb-slow-group-performance.

      I have a MongoDB collection of over 1,000,000 records. Each record size is around 20K (so the total collection size is around 20GB).

      I have a 'type' field in the collection (that can have around 10 different values). I would like to get the per-type counters for the collection. Also, there is an index on the 'type' field.

      I've tested two different approaches (assume python syntax):

      • A naive method - using 'count' calls for each of the values:
        for type_val in my_db.my_colc.distinct('type'):
            counters[type_val] = my_db.my_colc.find({'type' : type_val}).count()
        
      • Using aggregation framework with a '$group' syntax:
        counters = my_db.my_colc.aggregate([{'$group' :  {'_id': '$type', 'agg_val': { '$sum': 1 } }}])
        

      The performance I'm receiving for the first approach is about 2 orders of magnitude faster than for the 2nd approach. Seems to be related to the fact that count runs on the indices only, without accessing the documents, while $group has to go over the documents one-by-one. (It's about 1min vs. 45mins).

      Is there any way to run an efficient grouping query on the 'type' index, that would use only the index, thus achieving the performance results from #1, but using the aggregation framework ?

            Assignee:
            Unassigned Unassigned
            Reporter:
            baruchoxman Baruch Oxman
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: