[SERVER-37176] IndexBoundsBuilder::unionize() can be skipped for $in Created: 17/Sep/18  Updated: 29/Oct/23  Resolved: 02/Nov/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 4.1.5

Type: Improvement Priority: Major - P3
Reporter: David Storch Assignee: Jacob Evans
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-35693 Parsing of $in takes quadratic time d... Closed
is related to SERVER-30189 Reduce calls to allocator for large $... Closed
is related to SERVER-34012 Planner's logic for taking union of i... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 2018-10-22, Query 2018-11-05
Participants:

 Description   

We need to tread with care to ensure that this is correct in all cases, but it appears that we can skip some planning work while building bounds for an $in.

If an $in has no regexes, then it consists of a list of sorted, deduped equalities which take the collation into account. The purpose of calling IndexBoundsBuilder::unionize() is to ensure that the IndexBounds have properly ordered interval lists, and that any overlapping intervals are merged. However, if we process the $in in sorted order, then the intervals should already be sorted. And if the $in has already been deduped, then each equality yields a distinct point interval, so there cannot be any overlapping intervals.
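The reasoning above can be sketched in a few lines of Python. This is an illustration only, not the server's C++ code: the `unionize` and `point_intervals_for_in` functions below are hypothetical stand-ins, integers stand in for collation-aware index keys, and a closed `(start, end)` tuple stands in for an index interval.

```python
from typing import List, Tuple

# Hypothetical stand-in for an index interval: a closed range [start, end].
Interval = Tuple[int, int]


def unionize(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap (illustrative only)."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval, so extend that interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def point_intervals_for_in(values: List[int]) -> List[Interval]:
    """Build one point interval per equality, in sorted, deduped order."""
    return [(v, v) for v in sorted(set(values))]


in_values = [7, 3, 3, 9, 1, 7]
bounds = point_intervals_for_in(in_values)

# Sorted, deduped point intervals are already ordered and disjoint,
# so unionizing them changes nothing and the call can be skipped.
assert unionize(bounds) == bounds
```

Under these assumptions the sort-and-merge pass is a no-op for a regex-free $in, which is the work this ticket proposes to skip.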

IndexBoundsBuilder::unionize() has been shown to be slow for queries which have a large list of $in values (say, several hundred thousand distinct values). Local testing suggests that skipping this work could speed up such queries, possibly by as much as 40%.

Note that this ticket is distinct from SERVER-34012, which identifies quadratic time complexity specifically for $or queries. In contrast, the $in code path for bounds building is not quadratic.



 Comments   
Comment by Githook User [ 02/Nov/18 ]

Author: Jacob Evans <jacob.evans@10gen.com>

Message: SERVER-37176 Skip unionize for $in with no regex
Branch: master
https://github.com/mongodb/mongo/commit/0452eb9615e1cef8ed3fe704937e50a6bf473be3
