-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Optimization
There have been some changes in index scanning code recently. It seems that full covered index scans are now faster than collection scans for simple queries.
If this is indeed the case, should we revisit the decision to always prefer collection scans in this case? Locally running
db.c.aggregate([{$match: {a:1}}, {$project: {_id:0, a:1}}, {$group: {_id: null, count: {$count: {}}}}]);
On ~300k identical {a:1} documents, seems that workingTimeMillis in the slow query log drops from about 141ms to 109ms when adding an index. In another case, running
db.c.aggregate([{$match: {a: {$mod: [2, 1]}}}, {$project: {_id:0, a:1}}, {$group: {_id: null, count: {$count: {}}}}]);
on ~350k identical {a:1, b:1} documents, seeing a slight drop from ~295ms to ~281ms when adding an index {a, b}. Here are the index bounds
"indexBounds" : { "a" : [ "[nan.0, inf.0]" ], "b" : [ "[MinKey, MaxKey]" ] }
There may be something I'm missing here - for instance, this isn't a thorough consideration of the EXACT, INEXACT_COVERED, INEXACT_FETCH cases, so I'm not sure if these examples are truly apples-to-apples comparisons. Without the group stage to force the query to read the entire table, IIRC, the collection scan case uses classic.
Chris Harris suggested adding something to QO triage to revisit this in light of the recent code changes. Chris also pointed to SERVER-23406 and SERVER-20066 as possibly related issues.
- related to
-
SERVER-20066 Query planner should consider index scans on empty query predicates
- Closed
-
SERVER-23406 index scan is slower than full collection scan in some scenarios
- Closed