Core Server / SERVER-17806

AND_HASH plan where first child is a multikey index scan de-duplicates twice

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • None
    • Querying
    • None
    • Query Execution
    • Fully Compatible

    Description

      The IndexScan stage will automatically turn on its de-duplication logic if it finds that the index is multikey. The AND_HASH index intersection stage will also de-duplicate by default. It is wasteful to de-duplicate twice, as this requires keeping unnecessary in-memory state describing which query results have been seen so far.
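To make the cost concrete, here is a minimal sketch of two stacked de-duplication layers. It is not the server's code: `RecordId` is modeled as a plain integer, and `DedupFilter`, `LayeredResult`, and `runTwoLayers` are hypothetical names introduced only for illustration. The point is that the second layer stores exactly the same set of ids as the first and never rejects anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Stand-in for the server's record identifier type (assumption).
using RecordId = std::int64_t;

// Hypothetical filter that remembers every id it has let through.
class DedupFilter {
public:
    // Returns true the first time an id is seen, false for repeats.
    bool admit(RecordId id) { return seen_.insert(id).second; }
    std::size_t trackedIds() const { return seen_.size(); }

private:
    std::unordered_set<RecordId> seen_;
};

struct LayeredResult {
    std::vector<RecordId> output;   // ids that reach the consumer
    std::size_t scanTracked = 0;    // ids remembered by the scan-level filter
    std::size_t parentTracked = 0;  // ids remembered by the parent-level filter
};

// Pushes scanned ids through a scan-level filter and then a parent-level
// filter, mimicking a multikey scan feeding a de-duplicating parent stage.
LayeredResult runTwoLayers(const std::vector<RecordId>& scanned) {
    DedupFilter scanDedup;
    DedupFilter parentDedup;
    LayeredResult result;
    for (RecordId id : scanned) {
        if (!scanDedup.admit(id)) {
            continue;  // duplicate removed by the first layer
        }
        // The first layer already guarantees uniqueness, so the second
        // layer only spends memory and never filters anything out.
        bool fresh = parentDedup.admit(id);
        assert(fresh);
        result.output.push_back(id);
    }
    result.scanTracked = scanDedup.trackedIds();
    result.parentTracked = parentDedup.trackedIds();
    return result;
}
```

Calling `runTwoLayers({7, 7, 3, 7, 3, 9})` yields three output ids, and both filters end up holding the same three ids, which is the redundant state the description refers to.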

      We should consider adding de-duplication analysis to the query planner's analysis phase. In particular, whether or not duplication is possible could be added as a property of a QuerySolutionNode, and could be computed via QuerySolutionNode::computeProperties(). The analysis phase would ensure that a plan de-duplicates at most once. No need to duplicate the de-duplication!
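One way such an analysis pass could look is sketched below. This is an illustration under stated assumptions, not the planner's actual data structures: `Node`, `NodeKind`, `makeNode`, `mayProduceDuplicates`, and `assignDedup` are hypothetical names, and the sketch assumes that a de-duplicating AND_HASH parent covers the output of its first child only, which is the case named in the ticket title.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

enum class NodeKind { IndexScan, AndHash };

// Hypothetical, heavily simplified stand-in for a query solution node.
struct Node {
    NodeKind kind = NodeKind::IndexScan;
    bool multikey = false;  // meaningful for IndexScan only
    bool dedup = false;     // decided by the analysis pass below
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> makeNode(NodeKind kind, bool multikey = false) {
    auto node = std::make_unique<Node>();
    node->kind = kind;
    node->multikey = multikey;
    return node;
}

// Property: can this subtree emit the same document more than once?
bool mayProduceDuplicates(const Node& node) {
    switch (node.kind) {
        case NodeKind::IndexScan:
            return node.multikey && !node.dedup;
        case NodeKind::AndHash:
            return false;  // per the ticket, AND_HASH de-duplicates by default
    }
    return false;
}

// Analysis pass: enable de-duplication on a scan only when it is multikey
// and no ancestor already de-duplicates its output.
void assignDedup(Node& node, bool coveredByParent) {
    if (node.kind == NodeKind::IndexScan) {
        node.dedup = node.multikey && !coveredByParent;
        return;
    }
    node.dedup = true;
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        // Assumption: only the first child's output is covered by the parent.
        assignDedup(*node.children[i], i == 0);
    }
}
```

For an AND_HASH node over two multikey index scans, `assignDedup(*root, false)` leaves de-duplication off in the first scan and on in the second, so the first child's results are de-duplicated exactly once. A real implementation would presumably hang the property off `QuerySolutionNode::computeProperties()` as the description suggests.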

          People

            backlog-query-execution Backlog - Query Execution
            david.storch@mongodb.com David Storch
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated: