Core Server / SERVER-59320

Use DISTINCT_SCAN on multikey indexes in special cases

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Query Optimization

      Given an index on {X:1, Y:1, Z:1, D:1} and a distinct() query that filters on X, Y, and Z and requests the distinct values of D, we currently do not use DISTINCT_SCAN if D is multikey. SERVER-28952 explains why this was done.
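
      To make the restriction concrete, here is a toy Python model (not server code) of one way a DISTINCT_SCAN over a multikey D can go wrong. It assumes a simplified index on {x:1, d:1}, that an array-valued d produces one index key per element, and that distinct() returns every array element of each matching document. The helper names are hypothetical, and the scenario is an illustration rather than a summary of SERVER-28952.

```python
def d_values(doc):
    """Return d as a list, treating a scalar as a one-element array."""
    d = doc["d"]
    return d if isinstance(d, list) else [d]


def index_keys(doc):
    """Assumed multikey behavior: one (x, d) key per array element of d."""
    return [(doc["x"], v) for v in d_values(doc)]


def distinct_by_collection_scan(docs, x, d_min):
    """Reference answer: match whole documents, then collect all of their d values."""
    out = set()
    for doc in docs:
        if doc["x"] == x and any(v > d_min for v in d_values(doc)):
            out.update(d_values(doc))
    return sorted(out)


def distinct_by_index_scan(docs, x, d_min):
    """Naive distinct scan: only keys inside the bounds on d are ever visited."""
    keys = sorted(k for doc in docs for k in index_keys(doc))
    return sorted({d for (kx, d) in keys if kx == x and d > d_min})


docs = [{"x": 1, "d": [1, 5]}]

# Equivalent of distinct("d", {x: 1, d: {$gt: 3}})
print(distinct_by_collection_scan(docs, 1, 3))  # [1, 5]
print(distinct_by_index_scan(docs, 1, 3))       # [5]
```

      In this model the document matches because of the element 5, so the element 1 belongs in the result, but the key (1, 1) lies outside the scanned bounds and is never visited.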

       

      I believe there is a special case where we can actually use DISTINCT_SCAN when D is multikey:

      - D has [MinKey, MaxKey] bounds.

      AND one of the following is true:

      1) None of X, Y, Z share a multikey path prefix with D.

      2) All of X, Y, Z that share a multikey path prefix with D have [MinKey, MaxKey] bounds.

       

      For example, assuming D has [MinKey, MaxKey] bounds:

      If X is 'a.b' and D is 'c.d' then the optimization can be done.

      If X is 'a.b' and D is 'a.c' and 'a' is not multikey, the optimization can be done.

      If X is 'a.b' and D is 'a.c', 'a' is multikey, and the bounds on 'a.b' are not [MinKey, MaxKey], the optimization cannot be done.
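
      The proposed conditions and the three examples above can be sketched as a small predicate. This is a minimal Python sketch under stated assumptions, not the planner's implementation: IndexField, FULL_RANGE, and can_use_distinct_scan are hypothetical names, bounds are modeled as a simple (low, high) tuple, and the set of multikey paths is assumed to be known per index.

```python
from dataclasses import dataclass

# Stand-in for [MinKey, MaxKey] bounds (hypothetical representation).
FULL_RANGE = ("MinKey", "MaxKey")


@dataclass(frozen=True)
class IndexField:
    path: str                   # dotted field path, e.g. "a.b"
    bounds: tuple = FULL_RANGE  # (low, high) index bounds


def path_prefixes(path):
    """Return every dotted prefix of a path, including the path itself."""
    parts = path.split(".")
    return {".".join(parts[:i]) for i in range(1, len(parts) + 1)}


def shares_multikey_prefix(path_a, path_b, multikey_paths):
    """True if the two paths have a common prefix that is multikey."""
    common = path_prefixes(path_a) & path_prefixes(path_b)
    return any(p in multikey_paths for p in common)


def can_use_distinct_scan(filter_fields, distinct_field, multikey_paths):
    """Apply the proposed special case for a multikey distinct field."""
    # D itself must be unconstrained.
    if distinct_field.bounds != FULL_RANGE:
        return False
    # Every filter field sharing a multikey prefix with D must also be
    # unconstrained. If no field shares one, all() over nothing is True,
    # which covers condition 1.
    return all(
        f.bounds == FULL_RANGE
        for f in filter_fields
        if shares_multikey_prefix(f.path, distinct_field.path, multikey_paths)
    )


x = IndexField("a.b", (1, 1))  # point bounds on X

print(can_use_distinct_scan([x], IndexField("c.d"), {"c.d"}))  # True
print(can_use_distinct_scan([x], IndexField("a.c"), {"a.c"}))  # True
print(can_use_distinct_scan([x], IndexField("a.c"), {"a"}))    # False
```

      Prefixes are compared per path component, so 'ab.c' and 'a.c' are not treated as sharing the prefix 'a'.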

       

       

      These are the same conditions we use for determining whether a multikey index can provide a sort. See this code for a more detailed explanation.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            ian.boros@mongodb.com Ian Boros