Details
Improvement
Resolution: Unresolved
Major - P3
Query Execution
Fully Compatible
Description
The IndexScan stage automatically turns on its de-duplication logic when it finds that the index is multikey. The AND_HASH index intersection stage also de-duplicates by default. De-duplicating twice is wasteful: each de-duplicating stage keeps in-memory state describing which query results it has seen so far, and the second copy of that state is unnecessary.
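To illustrate where the extra state comes from, here is a minimal standalone sketch of a scan over a multikey-style index. It is not the server's IndexScan code: `RecordId`, `IndexEntry`, `ScanResult`, and `scanRange` are hypothetical names chosen for this example, and the index is modeled as a plain sorted vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a record identifier (not the server's type).
using RecordId = std::int64_t;

// One index entry: (key, record). A multikey index stores one entry per
// array element, so the same record can appear under several keys.
using IndexEntry = std::pair<int, RecordId>;

struct ScanResult {
    std::vector<RecordId> records;  // records returned by the scan
    std::size_t seenSetSize = 0;    // entries held in the de-duplication set
};

// Return the records whose key lies in [lo, hi]. When `dedup` is true, each
// record is returned at most once, at the cost of remembering every record
// already returned.
ScanResult scanRange(const std::vector<IndexEntry>& index, int lo, int hi, bool dedup) {
    assert(lo <= hi);
    ScanResult out;
    std::unordered_set<RecordId> seen;
    for (const auto& entry : index) {
        if (entry.first < lo || entry.first > hi) {
            continue;  // key outside the scanned bounds
        }
        if (dedup && !seen.insert(entry.second).second) {
            continue;  // record already returned, drop the duplicate
        }
        out.records.push_back(entry.second);
    }
    out.seenSetSize = seen.size();
    return out;
}
```

In this sketch the `seen` set grows with the number of distinct records returned. If a parent stage tracks the same records again, that memory is paid for twice.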
We should consider adding de-duplication analysis to the query planner's analysis phase. In particular, whether or not duplication is possible could be added as a property of a QuerySolutionNode, and could be computed via QuerySolutionNode::computeProperties(). The analysis phase would ensure that a plan de-duplicates at most once. No need to duplicate the de-duplication!
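One possible shape for such an analysis is sketched below. This is a simplified model, not the planner's actual code: `PlanNode`, `StageType`, `assignDedup`, and `maxDedupsOnPath` are hypothetical names, the `computeProperties` here is a free function standing in for the idea of QuerySolutionNode::computeProperties(), and the sketch assumes that each stage's de-duplication can be switched on or off independently. A bottom-up pass records whether duplicates are possible in each subtree, and a top-down pass then enables de-duplication only at the highest capable node on each path.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

enum class StageType { IndexScan, AndHash, Fetch };

// Hypothetical, simplified stand-in for a query solution node.
struct PlanNode {
    StageType type = StageType::Fetch;
    bool multikey = false;            // only meaningful for IndexScan
    bool duplicatesPossible = false;  // computed property
    bool dedup = false;               // decided by the analysis
    std::vector<std::unique_ptr<PlanNode>> children;

    // Assumption: only these two stage types know how to de-duplicate.
    bool canDedup() const {
        return type == StageType::IndexScan || type == StageType::AndHash;
    }
};

std::unique_ptr<PlanNode> makeNode(StageType type, bool multikey = false) {
    auto node = std::make_unique<PlanNode>();
    node->type = type;
    node->multikey = multikey;
    return node;
}

// Bottom-up: duplicates are possible if this node is a multikey index scan
// or if any child may produce duplicates.
void computeProperties(PlanNode& node) {
    node.duplicatesPossible = node.type == StageType::IndexScan && node.multikey;
    for (auto& child : node.children) {
        computeProperties(*child);
        node.duplicatesPossible = node.duplicatesPossible || child->duplicatesPossible;
    }
}

// Top-down: turn de-duplication on at the first capable node whose subtree
// may produce duplicates, and leave it off everywhere below that node.
void assignDedup(PlanNode& node, bool ancestorDedups = false) {
    node.dedup = !ancestorDedups && node.canDedup() && node.duplicatesPossible;
    assert(!(node.dedup && ancestorDedups));  // never de-duplicate twice
    for (auto& child : node.children) {
        assignDedup(*child, ancestorDedups || node.dedup);
    }
}

// Largest number of de-duplicating nodes on any root-to-leaf path.
int maxDedupsOnPath(const PlanNode& node) {
    int below = 0;
    for (const auto& child : node.children) {
        below = std::max(below, maxDedupsOnPath(*child));
    }
    return below + (node.dedup ? 1 : 0);
}
```

With a plan shaped like Fetch over AND_HASH over two index scans, one of them multikey, this model de-duplicates only in the AND_HASH node and leaves both scans without de-duplication state. Whether the real stages allow that choice, and whether the intersection stage or the scan is the cheaper place to de-duplicate, would need to be settled as part of the actual design.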