Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-74764

[CQF] Improve heuristics to guide fetch and index intersection

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization
    • QO 2023-05-29, QO 2023-06-12, QO 2023-06-26, QO 2023-07-10, QO 2023-07-24, QO 2023-08-07, QO 2023-08-21, QO 2023-09-04, QO 2023-09-18, QO 2023-10-02, QO 2023-10-16, QO 2023-10-30, QO 2023-11-13

      Given N conjuncts (predicates) in a Sargable node, and in the absence of heuristics, the optimizer will explore upto 2^N ways to split the predicates into "left" and "right" sides. Those splits are used to support both Index+Fetch and Index+Index intersection rewrites.

      Currently there exists a heuristic aiming to reduce the number of plans explored. It works in the following way. For a given collection, we keep track of the top-level fields used in any of its indexes. For example for an index set of (a, b), (c), (d.e, a.f) we will retain the field set {a, b, c, d}. This set is used as a constrained in the following way. If the index' side predicates's top-level field set is not a subset of the set above, we reject the rewrite. For example, if we have a sargable node referencing the field "f", we will not send it to the index side and thus reduce the number of explored plans.

      This heuristic helps reduce the exploration down from exponential but can be improved in the following ways:

      1. Keep track of the complete path instead of top-level fields.
      2. Particularly in the case of index intersection, there is probably a better way to constrain the set of rewrites. For example, we can say that the field "a" will be satisfied only on the left and field "b" only on the right.

      Ideally the approach will NOT reduce the number of valid splits, but reduce the number of redundant splits which do not lead to candidate indexes being generated on the child sargable nodes. Furthermore, we may consider reducing the maximum number of sargable node splits. We currently can split a sargable node up to 2 times, for a total of 4 nodes resulting from an input sargable node.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            svilen.mihaylov@mongodb.com Svilen Mihaylov (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated: