[SERVER-66177] Optimize time-series sorting on multikey index Created: 03/May/22  Updated: 25/Jan/24  Resolved: 25/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Percy Assignee: Backlog - Query Integration
Resolution: Duplicate Votes: 0
Labels: qi-timeseries
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-83348 Extend bounded sort optimization for ... Backlog
Related
Assigned Teams:
Query Integration
Participants:

 Description   

When you sort by time or (meta, time) on a time-series collection, we're currently not attempting to optimize any case where the index is multikey.

Multikey indexes have more than one index entry per document, and scanning the index produces each document once, the first time it encounters one of its index entries. This means scanning a subset of the index can produce documents in a different order than scanning the whole index.
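
To make the ordering problem concrete, here is a toy Python model of a multikey index scan. It is only a sketch under simplifying assumptions, not the server's actual index or scan code: `build_index` and `scan` are hypothetical helper names, documents are reduced to an id plus a list of values for one indexed array field, and the scan deduplicates by remembering which ids it has already returned.

```python
def build_index(docs):
    """Return sorted (key, doc_id) entries, one entry per array element.

    `docs` maps a document id to the list of values in its indexed field,
    so a document with several values contributes several index entries.
    """
    entries = []
    for doc_id, values in docs.items():
        for value in values:
            entries.append((value, doc_id))
    return sorted(entries)


def scan(entries, lo=None, hi=None):
    """Walk the index in key order within optional inclusive bounds.

    Each document is produced once, the first time one of its entries
    is encountered inside the bounds.
    """
    seen = set()
    result = []
    for key, doc_id in entries:
        if lo is not None and key < lo:
            continue
        if hi is not None and key > hi:
            continue
        if doc_id not in seen:
            seen.add(doc_id)
            result.append(doc_id)
    return result


# Document "A" has two index entries (1 and 10); "B" has one (5).
docs = {"A": [1, 10], "B": [5]}
index = build_index(docs)

full_order = scan(index)           # entries visited: (1, A), (5, B), (10, A)
bounded_order = scan(index, lo=3)  # entries visited: (5, B), (10, A)
```

In this model the unbounded scan yields "A" before "B", while the scan bounded below at 3 yields "B" before "A", because the bound skips the entry that would otherwise have produced "A" first.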

For time-series (when sorting by time or (meta, time)), some cases we could probably improve are:

  • When the multikey scan has no bounds.
  • When the only multikey fields are irrelevant trailing fields.


 Comments   
Comment by Gil Alon [ 25/Jan/24 ]

This is a duplicate of SERVER-83348, and I added the description of this ticket to SERVER-83348. SERVER-83348 describes one other extension, and this one describes 2 extensions. We should try to do all of them in one ticket.

Comment by Gil Alon [ 11/Jan/24 ]

Would SERVER-83348 also be relevant in this case?

Comment by Ana Meza [ 08/Nov/22 ]

rushan.chen@mongodb.com - during QE Triage we were wondering if your team has capacity to work on this ticket. If not, maybe send it to Backlog?

Comment by David Percy [ 25/Oct/22 ]

I don't know of a specific workload, no. "Nice to have" is probably a good description--this is an optimization we just chose not to implement yet.

Comment by Rushan Chen [ 25/Oct/22 ]

Are there known workloads that have such index fields with multiple values per doc? Or is this a nice to have? If the latter, it shouldn't be part of the project.

Can def schedule this as a separate improvement.

Comment by Ana Meza [ 25/Oct/22 ]

rushan.chen@mongodb.com rui.liu@mongodb.com we reviewed this ticket during QO Triage Quick Wins and we were wondering if we can add this ticket to PM-3050?

Generated at Thu Feb 08 06:04:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.