Apply groupByDistinct optimization to $min


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • 3
    • TBD

      We should be able to use the $groupByDistinct/DISTINCT_SCAN optimization for $min, similar to $max, $first/$last, and $top/$bottom. SERVER-94159 implements this optimization for $max. However, the same approach cannot be used as-is for $min, because the field passed to $min may contain null values.

      For example, if we have a query like 

      {$group: {
          _id: "$a", 
          fc: {$min: "$c"}}}

      and $c contains a null value (e.g. for a=1, $c takes the values 2, null, 1 across three documents), we would sort those documents in the order null, 1, 2, and $min would retrieve the first value. The group would then return null instead of 1, even though $min should ignore null values and return the smallest non-null value.
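
      The mismatch described above can be reproduced outside the server with a small simulation. This is only a sketch: compareBsonLike is a hypothetical, simplified stand-in for BSON ordering (it models only "null sorts before numbers"), and firstAfterSort and minIgnoringNull are illustrative names, not server functions.

```javascript
// Simplified stand-in for BSON ordering (assumption): null sorts before any number.
function compareBsonLike(x, y) {
  if (x === null && y === null) return 0;
  if (x === null) return -1;
  if (y === null) return 1;
  return x - y;
}

// Documents for the group a = 1, taken from the example in the ticket.
const docs = [{ a: 1, c: 2 }, { a: 1, c: null }, { a: 1, c: 1 }];

// Models the "sort ascending, keep the first entry" strategy.
function firstAfterSort(group) {
  const sorted = group.map((d) => d.c).sort(compareBsonLike);
  return sorted[0];
}

// Models the $min semantics described in the ticket: null values are ignored,
// and null is returned only when no non-null value exists.
function minIgnoringNull(group) {
  const values = group.map((d) => d.c).filter((v) => v !== null);
  return values.length === 0 ? null : values.reduce((m, v) => (v < m ? v : m));
}

console.log(firstAfterSort(docs));  // null
console.log(minIgnoringNull(docs)); // 1
```

      The two results differ only because of the leading null, which is why the same strategy works for $max: null sorts first, so it can never be the last entry of a group that contains a non-null value.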

      To avoid this, we should either fall back to COLLSCAN when null values are present (given some sort of schema constraint) or come up with a way to return the first non-null value instead of just the first value after the sort order is applied.
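
      The second option can be sketched as follows. This is an illustration under stated assumptions, not a proposed server implementation: minPerGroupSkippingNulls is a hypothetical helper, sortedEntries models index keys already ordered by (a, c) with null sorting before numbers, and the sketch walks every entry, whereas a real plan would presumably seek past the null keys instead.

```javascript
// Hypothetical helper: returns, per group key `a`, the first non-null `c`
// in sorted order, or null when the group holds only null values.
function minPerGroupSkippingNulls(sortedEntries) {
  const result = new Map();
  for (const { a, c } of sortedEntries) {
    if (!result.has(a)) {
      // First entry seen for this group; it may be null.
      result.set(a, c);
    } else if (result.get(a) === null && c !== null) {
      // First non-null entry replaces a leading null.
      result.set(a, c);
    }
    // Later entries of the group are larger in sort order and are ignored.
  }
  return result;
}

// Assumed input: entries sorted by (a, c), nulls first within each group.
const sortedEntries = [
  { a: 1, c: null }, { a: 1, c: 1 }, { a: 1, c: 2 },
  { a: 2, c: null },
  { a: 3, c: 5 }, { a: 3, c: 7 },
];

const mins = minPerGroupSkippingNulls(sortedEntries);
console.log(mins.get(1), mins.get(2), mins.get(3)); // 1 null 5
```

      Group 2 shows the case that still has to return null: when every value in the group is null, there is no non-null value to skip ahead to.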

              Assignee:
              Unassigned
              Reporter:
              Sopho Kevlishvili (Inactive)
              Votes:
              0
              Watchers:
              4
