AND_HASH plan where first child is a multikey index scan de-duplicates twice


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Querying
    • Query Execution
    • Fully Compatible

      The IndexScan stage will automatically turn on its de-duplication logic if it finds that the index is multikey. The AND_HASH index intersection stage will also de-duplicate by default. It is wasteful to de-duplicate twice, as this requires keeping unnecessary in-memory state describing which query results have been seen so far.

      We should consider adding de-duplication analysis to the query planner's analysis phase. In particular, whether or not duplication is possible could be added as a property of a QuerySolutionNode, and could be computed via QuerySolutionNode::computeProperties(). The analysis phase would ensure that a plan de-duplicates at most once. No need to duplicate the de-duplication!
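      As a rough illustration of the proposed analysis, here is a minimal, self-contained sketch. It does not use the real planner classes: PlanNode, NodeKind, makeScan, makeAndHash, assignDedup and countDedupStages are all hypothetical names invented for this example. It also assumes one particular policy, namely that the AND_HASH stage takes over de-duplication for any multikey scans beneath it; the opposite choice (scans de-duplicate, AND_HASH does not) would satisfy the "at most once" goal equally well.

```cpp
#include <memory>
#include <vector>

// Hypothetical, simplified stand-ins for query solution nodes.
// These are NOT the real QuerySolutionNode classes.
enum class NodeKind { IndexScan, AndHash };

struct PlanNode {
    NodeKind kind = NodeKind::IndexScan;
    bool multikey = false;  // only meaningful for IndexScan
    bool dedups = false;    // set by the analysis: should this stage de-duplicate?
    std::vector<std::unique_ptr<PlanNode>> children;
};

std::unique_ptr<PlanNode> makeScan(bool multikey) {
    auto node = std::make_unique<PlanNode>();
    node->kind = NodeKind::IndexScan;
    node->multikey = multikey;
    return node;
}

std::unique_ptr<PlanNode> makeAndHash() {
    auto node = std::make_unique<PlanNode>();
    node->kind = NodeKind::AndHash;
    return node;
}

// Decides which stages de-duplicate. `ancestorDedups` is true when some stage
// above `node` has already taken responsibility for de-duplication.
// Returns true if the subtree may still emit duplicates to its parent.
bool assignDedup(PlanNode& node, bool ancestorDedups) {
    if (node.kind == NodeKind::IndexScan) {
        // Only a multikey scan can return the same document more than once.
        node.dedups = node.multikey && !ancestorDedups;
        return node.multikey && !node.dedups;
    }
    // AND_HASH (assumed policy): tell the children that a stage above them
    // will de-duplicate, then check whether any child can still emit duplicates.
    bool childMayDup = false;
    for (auto& child : node.children) {
        childMayDup = assignDedup(*child, true) || childMayDup;
    }
    // De-duplicate here only if it is needed and nobody above already does it.
    node.dedups = childMayDup && !ancestorDedups;
    return childMayDup && !node.dedups;
}

// Counts the stages in the plan that will de-duplicate.
int countDedupStages(const PlanNode& node) {
    int count = node.dedups ? 1 : 0;
    for (const auto& child : node.children) {
        count += countDedupStages(*child);
    }
    return count;
}
```

      Under these assumptions, running assignDedup(*root, false) on an AND_HASH whose first child is a multikey scan marks only the AND_HASH node as de-duplicating, so the plan keeps a single set of seen results instead of two.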

            Assignee:
            [DO NOT USE] Backlog - Query Execution
            Reporter:
            David Storch
            Votes:
            0
            Watchers:
            6

              Created:
              Updated: