Right now the columnstore index is used whenever it is available and a collection scan would otherwise have been chosen. With our current implementation, there are cases where the column scan is significantly worse than a collection scan: for example, when the collection is small and fits entirely in memory, or when the documents are extremely small (<1 KB). We expect the collection scan to beat the column scan in these cases now, and probably in the future.
This task is to determine:
(a) Should we do anything about this during query planning? For example, should we use our estimates of the collection's size and number of records to guess which plan is better?
(b) If yes for (a), how should we decide between the two? A simple query knob "cutoff" value? Or something fancier?
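
To make the "cutoff" option concrete, here is a minimal sketch of what a knob-based heuristic could look like. It is not existing planner code: `CollectionStats`, `prefer_column_scan`, and both threshold constants are hypothetical names, and the threshold values are placeholders rather than measured numbers. It assumes the planner can get estimates of the record count and data size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionStats:
    """Planner-side estimates. Field names are hypothetical."""
    num_records: int      # estimated number of documents
    data_size_bytes: int  # estimated total size of the collection's data


# Placeholder cutoffs, NOT measured values. In practice these would be
# query knobs tuned from benchmarks.
MIN_AVG_DOC_SIZE_BYTES = 1024                  # "extremely small" documents (<1 KB)
MIN_COLLECTION_SIZE_BYTES = 256 * 1024 * 1024  # stand-in for "fits in memory"


def prefer_column_scan(
    stats: CollectionStats,
    min_avg_doc_size: int = MIN_AVG_DOC_SIZE_BYTES,
    min_collection_size: int = MIN_COLLECTION_SIZE_BYTES,
) -> bool:
    """Return True if the column scan should be kept, False to fall back
    to a collection scan."""
    if stats.num_records <= 0:
        # Empty collection or no usable estimate: nothing to gain from columns.
        return False
    if stats.data_size_bytes < min_collection_size:
        # Small collection that likely fits in memory.
        return False
    avg_doc_size = stats.data_size_bytes / stats.num_records
    if avg_doc_size < min_avg_doc_size:
        # Documents too small for the column scan to pay off.
        return False
    return True


if __name__ == "__main__":
    small = CollectionStats(num_records=1_000, data_size_bytes=1_000 * 4096)
    large = CollectionStats(num_records=1_000_000, data_size_bytes=1_000_000 * 4096)
    print(prefer_column_scan(small), prefer_column_scan(large))
```

A "fancier" alternative would replace the two hard cutoffs with a cost estimate for each plan, but the knob version is easier to tune and to disable if the estimates turn out to be unreliable.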