  1. Core Server
  2. SERVER-66701

Optimize $addToSet accumulators into $group stages where possible

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      I have seen a number of customer queries use $addToSet in scenarios where it is unnecessary and is in fact a poor choice, because it materializes a giant array. For example:

       [
        {
          "$match": {
            "myField.target": "TARGET"
          }
        },
        {
          "$group": {
            "_id": {},
            "value": {
              "$addToSet": "$metadata.something_id"
            }
          }
        },
        {
          "$project": {
            "_id": 0,
            "value": {
              "$size": "$value"
            }
          }
        }
      ]

      This query only needs the size of the resulting "value" array, so it could be rewritten as follows:

      [
        {
          "$match": {
            "myField.target": "TARGET"
          }
        },
        {
          "$group": {
            "_id": "$metadata.something_id"
          }
        },
        {
          "$group": {
            "_id": {},
            "value": {"$sum": 1}
          }
        },
        {
          "$project": {
            "_id": 0
          }
        }
      ] 
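
      To make the proposed optimization concrete, here is a minimal sketch in Python of the kind of pattern match and rewrite described above. It is illustrative only: the function name rewrite_add_to_set_size is hypothetical, it works on pipelines expressed as plain lists of dicts rather than on the server's internal stage representation, and it handles only the narrow shape from this ticket (a $group with a constant _id and a single $addToSet accumulator, followed directly by a $project that takes its $size). It also assumes the grouped field is present in every document; the two forms may count missing values differently, and the sketch does not check for that.

```python
def rewrite_add_to_set_size(pipeline):
    """Rewrite $group/$addToSet followed by $project/$size into two $group stages.

    Returns a new list. If the narrow pattern is not found, the stages are
    returned unchanged. Hypothetical helper, not part of any MongoDB API.
    """
    for i in range(len(pipeline) - 1):
        group = pipeline[i].get("$group")
        project = pipeline[i + 1].get("$project")
        if not isinstance(group, dict) or not isinstance(project, dict):
            continue

        # Only handle a constant group key ({} or null), as in the ticket.
        if group.get("_id", "absent") not in ({}, None):
            continue

        # Require exactly one accumulator, and it must be $addToSet.
        accumulators = {k: v for k, v in group.items() if k != "_id"}
        if len(accumulators) != 1:
            continue
        field, acc = next(iter(accumulators.items()))
        if not (isinstance(acc, dict) and list(acc) == ["$addToSet"]):
            continue

        # The set must be consumed only through $size, with _id excluded.
        if project != {"_id": 0, field: {"$size": "$" + field}}:
            continue

        replacement = [
            # First $group: one output document per distinct value.
            {"$group": {"_id": acc["$addToSet"]}},
            # Second $group: count those documents instead of building an array.
            {"$group": {"_id": group["_id"], field: {"$sum": 1}}},
            {"$project": {"_id": 0}},
        ]
        return pipeline[:i] + replacement + pipeline[i + 2:]

    return list(pipeline)


original = [
    {"$match": {"myField.target": "TARGET"}},
    {"$group": {"_id": {}, "value": {"$addToSet": "$metadata.something_id"}}},
    {"$project": {"_id": 0, "value": {"$size": "$value"}}},
]
rewritten = rewrite_add_to_set_size(original)
```

      The sketch deliberately refuses to rewrite anything it does not fully recognize, since the rewrite is only safe when no later stage needs the array itself.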

       

            Assignee: backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter: charlie.swanson@mongodb.com Charlie Swanson
            Votes: 0
            Watchers: 7
