Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-37176

IndexBoundsBuilder::unionize() can be skipped for $in

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 4.1.5
    • Affects Version/s: None
    • Component/s: Querying
    • None
    • Fully Compatible
    • Query 2018-10-22, Query 2018-11-05

      We need to tread with care to ensure that this is correct in all cases, but it appears that we can skip some planning work while building bounds for an $in.

      If an $in has no regexes, then it consists of a list of sorted, deduped equalities which take into account the collation. The purpose of calling IndexBoundsBuilder::unionize() is to ensure that the IndexBounds have properly ordered interval lists, and that any overlapping intervals are merged. However, if we process the $in in sorted order, then the intervals should already be sorted. And if the $in has already be deduped, then there cannot be any overlapping intervals.

      IndexBoundsBuilder::unionize() has been shown to be slow for queries which have a large list of $in values (say, several hundred thousand distinct values). Local testing shows that skipping this work could speed such queries up possibly by as much as 40%.

      Note that this ticket is distinct from SERVER-34012, which identifies quadratic time complexity specifically for $or queries. In contrast, the $in code path for bounds building is not quadratic.

            jacob.evans@mongodb.com Jacob Evans
            david.storch@mongodb.com David Storch
            0 Vote for this issue
            9 Start watching this issue