[SERVER-69956] Better query planning for choosing columnstore index vs collscan Created: 24/Sep/22  Updated: 29/Oct/23  Resolved: 02/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Task Priority: Major - P3
Reporter: Ian Boros Assignee: Alyssa Clark
Resolution: Fixed Votes: 0
Labels: pm2646-m4
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
Backwards Compatibility: Fully Compatible
Sprint: QE 2022-11-14, QE 2022-11-28, QE 2022-12-12
Participants:
Linked BF Score: 100

 Description   

Right now the columnstore index is used whenever it is available and a collection scan would otherwise have been used. With our current implementation, there are cases where the column scan is significantly worse than a collection scan, for example when the collection is small and fits entirely in memory, or when the documents are extremely small (<1 KB). We expect the collection scan to beat the column scan in these cases now, and probably in the future.

 

This task is to determine: (a) Should we do anything during query planning about this? Should we use our estimates of the collection's size and number of records to guess which plan is better?

(b) If yes for (a), how should we decide between the two? A simple query knob "cutoff" value? Or something fancier?
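
To make the "cutoff" option concrete, here is a minimal sketch of what such a planning-time heuristic might look like. It is illustrative only and is not the server's implementation: `CollectionStats`, `prefer_column_scan`, `available_memory_bytes`, and the 1 KB default for `min_avg_doc_size_bytes` are all hypothetical names and values, chosen to mirror the two bad cases described above (a collection that fits in memory, and very small documents).

```python
from dataclasses import dataclass


@dataclass
class CollectionStats:
    """Planner-side estimates. Field names are hypothetical, for illustration."""
    data_size_bytes: int  # estimated total size of the collection's documents
    num_records: int      # estimated number of documents


def prefer_column_scan(stats: CollectionStats,
                       available_memory_bytes: int,
                       min_avg_doc_size_bytes: int = 1024) -> bool:
    """Return True to keep the column scan, False to fall back to a collection scan.

    Both cutoffs are assumed knob values, not real server parameters.
    """
    if stats.num_records == 0:
        # Nothing to scan; the column scan cannot help.
        return False

    # Case 1: the collection is small enough to fit entirely in memory.
    if stats.data_size_bytes <= available_memory_bytes:
        return False

    # Case 2: the documents are extremely small on average.
    avg_doc_size = stats.data_size_bytes / stats.num_records
    if avg_doc_size < min_avg_doc_size_bytes:
        return False

    return True
```

A single boolean decision like this is cheap to evaluate during planning, but it only sees collection-level estimates; it knows nothing about which fields the query projects.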



 Comments   
Comment by Githook User [ 02/Dec/22 ]

Author:

{'name': 'Alyssa Wagenmaker', 'email': 'alyssa.wagenmaker@mongodb.com', 'username': 'awagenmaker'}

Message: SERVER-69956 Query planning heuristics for choosing columnscan
Branch: master
https://github.com/mongodb/mongo/commit/ad755125a09af6dc45e2cf696647f931a39566ce

Comment by Steve Tarzia [ 07/Oct/22 ]

We might also want to consider making this decision (column scan vs. collection scan) during execution, or with the multi-planner. It will be easier at runtime to get information about the ratio of projected field size to total document size. Either way, I think the work can fall under this ticket.
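
As a rough illustration of the ratio mentioned above, the sketch below estimates it from a sample of documents and picks a scan accordingly. Everything here is an assumption made for the example: the function names, the 0.25 threshold, the restriction to top-level fields, and the use of JSON-encoded length as a stand-in for on-disk document size. It does not describe how the server or the multi-planner actually measures this.

```python
import json


def projected_size_ratio(sample_docs, projected_fields):
    """Estimate (bytes of projected fields) / (bytes of whole documents).

    JSON-encoded length is used as a crude proxy for document size.
    Only top-level fields are considered.
    """
    total = 0
    projected = 0
    for doc in sample_docs:
        total += len(json.dumps(doc))
        subset = {k: doc[k] for k in projected_fields if k in doc}
        projected += len(json.dumps(subset))
    # With no sample there is no evidence that the projection is narrow.
    return projected / total if total else 1.0


def choose_scan(sample_docs, projected_fields, max_ratio=0.25):
    """Pick the column scan only when the projection is a small share of each document.

    max_ratio is an arbitrary illustrative cutoff.
    """
    ratio = projected_size_ratio(sample_docs, projected_fields)
    return "column_scan" if ratio <= max_ratio else "collection_scan"
```

For instance, with documents that carry one small field and one large string field, projecting only the small field yields a low ratio and favors the column scan, while projecting the large field favors the collection scan.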

Generated at Thu Feb 08 06:14:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.