[SERVER-18287] Restore the "impossible match" heuristic present in the 2.4 query engine (no longer present as of 2.6) Created: 01/May/15  Updated: 29/Dec/17  Resolved: 20/May/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.9, 3.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Angshuman Bagchi (Inactive) Assignee: Unassigned
Resolution: Done Votes: 0
Labels: 2426
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-23732 Aggregation should optimize an irrele... Backlog
Operating System: ALL
Steps To Reproduce:

use test
db.test.insert({a : 1, b : 2, c : 3})
db.test.insert({a : 2, b : 3, c : 4})
db.test.insert({a : 3, b : 4, c : 5})
db.test.insert({a : 1, b : 3, c : 4})
db.test.insert({a : 1, b : 4, c : 5})

On MongoDB 3.0.2

> db.test.find({a : 1, b : {$in : [] }}).explain(true)
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.test",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$and" : [
				{
					"a" : {
						"$eq" : 1
					}
				},
				{
					"b" : {
						"$in" : [ ]
					}
				}
			]
		},
		"winningPlan" : {
			"stage" : "COLLSCAN",
			"filter" : {
				"$and" : [
					{
						"a" : {
							"$eq" : 1
						}
					},
					{
						"b" : {
							"$in" : [ ]
						}
					}
				]
			},
			"direction" : "forward"
		},
		"rejectedPlans" : [ ]
	},
	"executionStats" : {
		"executionSuccess" : true,
		"nReturned" : 0,
		"executionTimeMillis" : 0,
		"totalKeysExamined" : 0,
		"totalDocsExamined" : 5,
		"executionStages" : {
			"stage" : "COLLSCAN",
			"filter" : {
				"$and" : [
					{
						"a" : {
							"$eq" : 1
						}
					},
					{
						"b" : {
							"$in" : [ ]
						}
					}
				]
			},
			"nReturned" : 0,
			"executionTimeMillisEstimate" : 0,
			"works" : 7,
			"advanced" : 0,
			"needTime" : 6,
			"needFetch" : 0,
			"saveState" : 0,
			"restoreState" : 0,
			"isEOF" : 1,
			"invalidates" : 0,
			"direction" : "forward",
			"docsExamined" : 5
		},
		"allPlansExecution" : [ ]
	},
	"serverInfo" : {
		"host" : "angshuman-macbook.local",
		"port" : 30000,
		"version" : "3.0.2",
		"gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
	},
	"ok" : 1
}

On MongoDB 2.6.9

db.test.find({a : 1, b : {$in : [] }}).explain()
{
	"cursor" : "BasicCursor",
	"isMultiKey" : false,
	"n" : 0,
	"nscannedObjects" : 5,
	"nscanned" : 5,
	"nscannedObjectsAllPlans" : 5,
	"nscannedAllPlans" : 5,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 0,
	"server" : "angshuman-macbook.local:30000",
	"filterSet" : false
}

On MongoDB 2.4.13

> db.test.find({a : 1, b : {$in : [] }}).explain()
{
	"cursor" : "BasicCursor",
	"isMultiKey" : false,
	"n" : 0,
	"nscannedObjects" : 0,
	"nscanned" : 0,
	"nscannedObjectsAllPlans" : 0,
	"nscannedAllPlans" : 0,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 0,
	"indexBounds" : {
		
	},
	"server" : "angshuman-macbook.local:30000"
}

Participants:
Case:

 Description   

The 2.4 query engine had a heuristic in which it would look for an "impossible match", i.e. query predicates that could not possibly match any documents. If such a query was identified, then query execution could be short-circuited. Impossible match analysis no longer exists in 2.6.x and 3.0.x versions, which can lead to less efficient queries in these newer versions (see repro steps above). The most obvious example of an "impossible match" is an empty $in predicate.

Note that if the index bounds are empty, the 2.6 engine will properly handle this and realize that it does not need to scan the index:

> db.version()
2.6.9
> t.drop()
> t.insert({a: 1})
 
// Unindexed query has to do the collection scan.
> t.find({a: {$in: []}}).explain().nscannedObjects
1
 
// Indexed query doesn't need to look at any index keys or docs.
> t.ensureIndex({a: 1})
> t.find({a: {$in: []}}).explain().nscannedObjects
0

In 2.6 this behavior for indexed queries falls out of the regular query execution pipeline. In 2.4, however, the "impossible match" heuristic makes it so that empty bounds on even an unindexed field do not access any indices or data:

> db.version()
2.4.8
 
> t.drop()
> t.insert({a: 1})
 
// Unindexed query scans nothing due to "impossible match".
> t.find({a: {$in: []}}).explain().nscannedObjects
0



 Comments   
Comment by David Storch [ 20/May/15 ]

This behavior was purposefully removed in 2.6.0 as part of the query engine refactor. This is a class of queries that users should never send to the server. It is part of the philosophy of the rewritten query engine to not do special case query handling, as this analysis slows down other queries that don't require it. "Impossible match" queries should just go through the normal query optimization path, and if you don't have an index to efficiently support execution of your (arguably ill-formed) query, then query execution will of course be slow.

Closing as Works as Designed.

Generated at Thu Feb 08 03:47:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.