Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-27494

Avoid unwind on multi-key index field for covered index aggregation

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Minor - P4 Minor - P4
    • None
    • Affects Version/s: 3.4.0
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Query Optimization

      I think unwind stage can be avoid when aggregation query only need index field (multi-key index field). And Also aggregation grouping or sorting stage can be solved via sorted index without quick sort processing. (I assume multi-key index has same structure with normal b-tree index.)

      We have below collections which has array field and multi-key index on that field.

      db.mkey.insert({ "_id" : 1, "apps": [ "A", "B", "C"] })
      db.mkey.insert({ "_id" : 2, "apps": ["A"] })
      db.mkey.insert({ "_id" : 3, "apps": ["A", "C"] })
      db.mkey.createIndex({apps:1})
      

      And we want to aggregate apps and count for each app.

      db.mkey.aggregate([
          {'$unwind': '$apps'},
          {'$group': { _id: '$apps', sum: { '$sum': 1 } } },
          {'$match': { sum: { '$gte': 2 } } }
        ])
      
      { "_id" : "A", "sum" : 3 }
      { "_id" : "C", "sum" : 2 }
      

      This aggregation query can't use multi-key index and need to expand (unwind) apps field of all documents and sort.
      But if this query can use multi-key index of apps field, aggregation can be processed via multi-key index only not data files and no unwind and no quick sort processing. And I think this can be much faster than full collection scan.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            sunguck.lee@gmail.com 아나 하리
            Votes:
            3 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated: