A column scan generally beats a collection scan, even when the collection and its documents are small, if a large number of filters can be pushed down. The CSI plan selection heuristics should take this into account.
Also, we should look at combining the document size and collection size heuristics in a more intelligent way. My sense is that the heuristic should be something like this:
Use a column scan only if:
- the collection is larger than RAM
- or the collection is smaller than RAM and
  - the docs are large
  - or a large number of filters (at least 2 or 3?) can be pushed down
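
To make the proposed condition concrete, here is a minimal sketch of it in Python. It is not the actual planner code. The names `PlanInputs` and `should_use_column_scan` and both threshold values are hypothetical placeholders chosen for illustration. The filter threshold of 3 is just one of the "2 or 3" candidates above, and "docs are large" is left as a tunable byte cutoff because no concrete value has been settled.

```python
from dataclasses import dataclass

# Hypothetical thresholds, for illustration only; real values would need tuning.
LARGE_DOC_BYTES = 1024       # assumed cutoff for "the docs are large"
MIN_PUSHED_DOWN_FILTERS = 3  # the proposal suggests "at least 2 or 3?"


@dataclass
class PlanInputs:
    """Statistics the heuristic would need (hypothetical shape)."""
    collection_bytes: int
    available_ram_bytes: int
    avg_doc_bytes: int
    pushed_down_filters: int


def should_use_column_scan(
    inputs: PlanInputs,
    large_doc_bytes: int = LARGE_DOC_BYTES,
    min_filters: int = MIN_PUSHED_DOWN_FILTERS,
) -> bool:
    """Return True if the proposed heuristic would pick a column scan."""
    # Case 1: the collection does not fit in RAM.
    if inputs.collection_bytes > inputs.available_ram_bytes:
        return True
    # Case 2: the collection fits in RAM, so require large docs or
    # enough pushed-down filters. (A collection exactly equal to RAM
    # is treated as fitting; the proposal does not specify this case.)
    docs_are_large = inputs.avg_doc_bytes >= large_doc_bytes
    many_filters = inputs.pushed_down_filters >= min_filters
    return docs_are_large or many_filters
```

Taking the thresholds as parameters makes it easy to try both 2 and 3 as the filter cutoff when evaluating the heuristic against real workloads.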
This should let us choose a column scan for the charts workload running on a low-memory instance, and it may also enable column scans for some appropriate queries on regular instances.