In-progress work will change time series collections so that the RecordStore for the collection is clustered on _id. The _id of each document in a time series bucket collection encodes the bucket's minimum time with second resolution. Therefore, if a query involves predicates on time, these predicates may be rewritten as predicates on the bucket's _id (see related ticket SERVER-53758). In turn, we will extend the data access planner to produce bounded collection scan plans that take advantage of the clustering on _id. Unlike traditional COLLSCAN plans, these plans do not have to examine the entire RecordStore; they scan only a range of it. SERVER-54008 and SERVER-54398 track the work to support this in the classic execution engine.
In the long run, this will also have to work with SBE. Luckily, the SBE scan operator already supports a generic concept of bounds. The remaining work is to ensure that, given a QuerySolution with a bounded COLLSCAN, the SBE stage builder converts it into the correct SBE execution plan.
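The idea of a bounded scan over a store clustered on _id can be illustrated with a minimal Python sketch. This is not server code. It assumes an ObjectId-like 12-byte _id whose first 4 bytes are the bucket's minimum time in big-endian seconds, and it models the RecordStore as a sorted list. The names `make_id`, `id_bounds_for_min_time`, and `bounded_scan` are hypothetical and exist only for this illustration.

```python
import bisect
import struct
from datetime import datetime, timezone


def make_id(seconds: int, suffix: bytes = b"\x00" * 8) -> bytes:
    """Hypothetical 12-byte bucket id: 4-byte big-endian seconds + 8 other bytes."""
    return struct.pack(">I", seconds) + suffix


def id_bounds_for_min_time(start: datetime, end: datetime) -> tuple[bytes, bytes]:
    """Map start <= bucket min time <= end to inclusive bounds on _id.

    Sub-second precision is dropped, so the range may admit extra buckets
    within the boundary seconds; a residual filter is assumed to handle those.
    """
    lo = make_id(int(start.timestamp()), b"\x00" * 8)  # smallest id in that second
    hi = make_id(int(end.timestamp()), b"\xff" * 8)    # largest id in that second
    return lo, hi


def bounded_scan(sorted_ids: list[bytes], lo: bytes, hi: bytes) -> list[bytes]:
    """Return only the ids in [lo, hi] without visiting the rest of the store."""
    first = bisect.bisect_left(sorted_ids, lo)
    last = bisect.bisect_right(sorted_ids, hi)
    return sorted_ids[first:last]


# Toy "RecordStore" clustered on _id: four buckets with min times 100..400 s.
store = sorted(make_id(s, b"\x01" * 8) for s in (100, 200, 300, 400))
lo, hi = id_bounds_for_min_time(
    datetime.fromtimestamp(150, tz=timezone.utc),
    datetime.fromtimestamp(350, tz=timezone.utc),
)
hits = bounded_scan(store, lo, hi)
hit_seconds = [struct.unpack(">I", h[:4])[0] for h in hits]
```

Here `hit_seconds` contains only the buckets whose encoded minimum time falls inside the requested window, and the scan never touches records outside the two seek positions. The sketch only covers predicates on the bucket's minimum time; how predicates on individual measurement times map to _id bounds is outside what it shows.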
Is related to:
- SERVER-54008 Generalize CollectionScan node so it can perform bounded scans over time series bucket collections (Closed)
- SERVER-53758 Map predicates on min time to a portion of _id (Closed)
- SERVER-54398 Extend query planner to generate bounded collection scan plans for time series collections (Closed)