[SERVER-82253] Efficient IntervalRequirements for IN-lists Created: 17/Oct/23  Updated: 23/Nov/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Timour Katchaounov Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Optimization
Sprint: QO 2023-10-30, QO 2023-11-13
Participants:

 Description   

Currently, $in lists are represented in the Bonsai optimizer either as EqMember nodes or as generic disjunctive IntervalRequirements. The latter representation is the one used for interval simplification and further optimizations.

This turns out to be inefficient for two main reasons:

  • Generic disjunctions do not carry the information present in $in: that it is over a single field, that it consists only of point intervals, and, possibly, that all intervals are unique. As a result, the simplifier does work that is unnecessary.
  • There is a lot of copying from one representation to another. This results in substantial overhead.

Both points above become visible for large $in lists, on the order of 1K to 10K elements or more.
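
As a rough illustration of the two points above, the following Python sketch models the generic path. The names `Interval`, `to_generic_disjunction`, and `simplify_generic` are hypothetical and are not the Bonsai types or functions. Integer values stand in for BSON values. The sketch only shows the shape of the overhead: one new object per list element, followed by a simplification pass that cannot assume point intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical closed interval [low, high]; not the Bonsai type."""
    low: int
    high: int


def to_generic_disjunction(values):
    # One new interval object per $in element: the list contents are copied
    # into a second representation.
    return [Interval(v, v) for v in values]


def simplify_generic(intervals):
    # A generic simplifier cannot assume point intervals, so it sorts the
    # disjuncts and checks each one for overlap with its predecessor.
    merged = []
    for iv in sorted(intervals, key=lambda i: (i.low, i.high)):
        if merged and iv.low <= merged[-1].high:
            last = merged[-1]
            merged[-1] = Interval(last.low, max(last.high, iv.high))
        else:
            merged.append(iv)
    return merged


values = [3, 1, 2, 3]
generic = simplify_generic(to_generic_disjunction(values))
```

For a list that is known to consist of distinct points, the overlap check never merges anything except exact duplicates, so most of this work is wasted.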

 

The primary idea of this task is to implement a specialized representation of $in in the interval simplification and management framework. This should make it possible to avoid copying the list contents and to skip unnecessary simplifications. It can also be implemented in a substantially more compact way.
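
One possible shape for such a representation is sketched below in Python. It is an illustration only: `InListRequirement` and its methods are hypothetical names, not an existing or planned Bonsai API. It also assumes that the $in list arrives already sorted and deduplicated, and it uses integers in place of BSON values.

```python
from bisect import bisect_left


class InListRequirement:
    """Hypothetical specialized requirement for `field: {$in: [...]}`."""

    def __init__(self, field, sorted_unique_values):
        self.field = field
        # Assumption: the caller's sequence is already sorted and unique.
        # Keep a reference to it instead of copying element by element.
        self.points = sorted_unique_values

    def contains(self, value):
        # Binary search; every element is a point, so no bounds logic is needed.
        i = bisect_left(self.points, value)
        return i < len(self.points) and self.points[i] == value

    def intersect(self, other):
        # Conjunction of two $in lists over the same field is a linear merge,
        # with no generic interval simplification involved.
        if self.field != other.field:
            raise ValueError("requirements are over different fields")
        out, i, j = [], 0, 0
        a, b = self.points, other.points
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                out.append(a[i])
                i += 1
                j += 1
            elif a[i] < b[j]:
                i += 1
            else:
                j += 1
        return InListRequirement(self.field, out)


in_list = (1, 2, 3, 5, 8)
req = InListRequirement("a", in_list)
narrowed = req.intersect(InListRequirement("a", (2, 5, 9)))
```

Because the requirement records the field once and stores only the points, it is also more compact than a disjunction of per-element interval objects.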

