[SERVER-65159] Consider {a: 1, b: 1} index to satisfy sort on {a: 1, b: -1} Created: 01/Apr/22  Updated: 05/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Charlie Swanson Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-62405 Partially-streaming compound $sort Backlog
related to SERVER-69027 [CQF] Support for Recursive Index Nav... Closed
is related to SERVER-69359 Aggregate query bails on DISTINCT_SCA... Closed
Assigned Teams:
Query Optimization
Participants:

 Description   

There are two kinds of query plans we may consider here other than a full blocking sort:

  1. For each distinct value of a (in order ascending), scan the entire b sub-range in the reverse direction. This would be a streaming plan.
  2. For each distinct value of a (in order ascending), scan the entire b sub-range in order but put them into a blocking sort for just that sub-range. This should reverse them all but keep the disk accesses sequential. This is semi-streaming in that it doesn't need to block for all results, but will stop and start blocking for each "a" value.

Either of these plans will involve a high number of seek calls if "a" is high cardinality, which could make them not worth it compared to a blocking sort plan. The first plan would be better for very low "a" cardinality, and the second would be better for when there are very few number of "b" values per value of "a". So such plans would need to be considered with the query planner, ideally in a cost-based manner since our current cost model doesn't seem like it would take into account that the non-linear data accesses would be more expensive.



 Comments   
Comment by David Percy [ 22/Nov/22 ]

I think we'll be able to do this in Bonsai with recursive index navigation. The outer loop scans forward producing unique values of 'a', and the inner loop scans backward to get all the (a, b) entries within that value of 'a'.

Comment by Ana Meza [ 25/Oct/22 ]

Sending to backlog since probably not a quick win, also might be something difficult for our current query planner to use effectively - We expect Bonsai to be needed before we work on this ticket

Generated at Thu Feb 08 06:02:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.