Type: Bug
Resolution: Works as Designed
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 5.0.2
Component/s: None
Labels:
None

Assigned Teams:

Query Optimization
Operating System:
ALL
Steps To Reproduce:

db.x.insert({meta:1, control:{min:{time:1},max:{time:1}}})
db.x.insert({meta:2, control:{min:{time:1},max:{time:1}}})
db.x.createIndex({meta:1,"control.min.time":1,"control.max.time":1})
db.x.find({},{"meta":1, "control.max.time":1,_id:0},).hint({meta:1,"control.min.time":1,"control.max.time":1}).explain()

Without the hint, the plan is a collection scan:

rs:PRIMARY> db.x.find({},{"meta":1, "control.max.time":1,_id:0},).explain().queryPlanner
{
	"namespace" : "test.x",
	"indexFilterSet" : false,
	"parsedQuery" : { },
	"queryHash" : "CCF36A8A",
	"planCacheKey" : "C36EB5A6",
	"maxIndexedOrSolutionsReached" : false,
	"maxIndexedAndSolutionsReached" : false,
	"maxScansToExplodeReached" : false,
	"winningPlan" : {
		"queryPlan" : {
			"stage" : "PROJECTION_DEFAULT",
			"planNodeId" : 2,
			"transformBy" : {
				"meta" : true,
				"control" : { "max" : { "time" : true } },
				"_id" : false
			},
			"inputStage" : {
				"stage" : "COLLSCAN",
				"planNodeId" : 1,
				"filter" : { },
				"direction" : "forward"
			}
		},
		"slotBasedPlan" : {
			"slots" : "$$RESULT=s13 $$RID=s5 env: { s2 = Timestamp(1628870938, 1) (CLUSTER_TIME), s1 = TimeZoneDatabase(Australia/NSW...US/Arizona) (timeZoneDB), s3 = 1628870945358 (NOW) }",
			"stages" : "[2] traverse s13 s12 s4 [s5] {} {} \nfrom \n [1] scan s4 s5 none none none none [] @\"f4710229-f7b9-46e0-ac36-260c26f8ab26\" true false \nin \n [2] cfilter {isObject (s4)} \n [2] mkbson s12 s4 [meta] keep [control = s11] true false \n [2] traverse s11 s10 s6 {} {} \n from \n [2] project [s6 = getField (s4, \"control\")] \n [2] limit 1 \n [2] coscan \n in \n [2] cfilter {isObject (s6)} \n [2] mkbson s10 s6 [] keep [max = s9] true false \n [2] traverse s9 s8 s7 {} {} \n from \n [2] project [s7 = getField (s6, \"max\")] \n [2] limit 1 \n [2] coscan \n in \n [2] cfilter {isObject (s7)} \n [2] mkbson s8 s7 [time] keep [] true false \n [2] limit 1 \n [2] coscan \n \n \n"
		}
	},
	"rejectedPlans" : [ ]
}
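When comparing the hinted and unhinted plans, it can help to reduce each explain output to its list of stage names. The following is a minimal sketch in plain JavaScript; `collectStages` is a hypothetical helper (not part of the shell), and it assumes the plan shape shown above, with an optional `queryPlan` wrapper and children under `inputStage` or `inputStages`.

```javascript
// Hypothetical helper: walk a winningPlan and list the stage names it contains.
// Assumes children are found under "inputStage" (single) or "inputStages" (array).
function collectStages(winningPlan) {
  const stages = [];
  const visit = (node) => {
    if (!node || typeof node !== "object") return;
    if (typeof node.stage === "string") stages.push(node.stage);
    if (node.inputStage) visit(node.inputStage);
    if (Array.isArray(node.inputStages)) node.inputStages.forEach(visit);
  };
  // 5.0 explain output nests the classic plan under "queryPlan" when a
  // slotBasedPlan is present; fall back to the plan itself otherwise.
  visit(winningPlan.queryPlan || winningPlan);
  return stages;
}

// Trimmed copy of the unhinted winningPlan from the output above.
const unhinted = {
  queryPlan: {
    stage: "PROJECTION_DEFAULT",
    inputStage: { stage: "COLLSCAN", filter: {}, direction: "forward" },
  },
};

const stages = collectStages(unhinted);
console.log(stages);
console.log("uses index:", stages.includes("IXSCAN"));
```

In the shell, the same helper could be passed `explain().queryPlanner.winningPlan` for both the hinted and unhinted query. A plan whose stages include IXSCAN but no FETCH would indicate a covered index scan.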
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

On a collection of 4.3 GB, with an index of 14.8 MB, the query planner chooses a collection scan over a covered index scan.

> db.system.buckets.point_data.find({},{"meta":1, "control.max.time":1,_id:0},).hint({meta:1,"control.min.time":1,"control.max.time":1}).itcount()

This query takes 3.4s with the hint, but without the hint it takes 40s.

The collection here is the system.buckets.point_data collection for a time-series collection, but the issue is not specific to time-series.

I understand that a collection scan can often be faster, but in this case the roughly 290x size difference between the collection and the index should have been decisive. In any case, it doesn't look like the index scan was considered at all here: rejectedPlans is empty in the explain output from the reproduction steps.
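The figures quoted in this ticket can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 GB = 1000 MB), which the ticket does not state, and the variable names are illustrative.

```javascript
// Sizes and timings as quoted in the ticket.
// Assumption: decimal units, so 4.3 GB = 4300 MB.
const collectionMB = 4.3 * 1000;
const indexMB = 14.8;
const sizeRatio = collectionMB / indexMB; // how much more data the collection scan reads

const hintedSeconds = 3.4;   // covered index scan (with hint)
const unhintedSeconds = 40;  // collection scan (without hint)
const speedup = unhintedSeconds / hintedSeconds;

console.log("size ratio:", sizeRatio.toFixed(1));
console.log("speedup:", speedup.toFixed(1));
```

Under these assumptions the collection is about 290 times larger than the index, while the hinted query is about 12 times faster.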

is caused by

SERVER-23406 index scan is slower than full collection scan in some scenarios

Closed

is related to

SERVER-58276 Add time-series bounded collection scan plans to query multi-planner

Closed

related to

SERVER-20066 Query planner should consider index scans on empty query predicates

Closed

Assignee: [DO NOT USE] Backlog - Query Optimization
Reporter: Geert Bosch
Participants: [DO NOT USE] Backlog - Query Optimization, Geert Bosch, George Mihailov, James Wahlin, Louis Williams
Votes: 0
Watchers: 11

Created: Aug 13 2021 04:12:02 PM UTC
Updated: Nov 15 2024 02:58:17 PM UTC
Resolved: Nov 15 2024 02:58:17 PM UTC
