Loading...

XML

Word

Printable

JSON

Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.6.0
Component/s: Querying
Labels:
None

Assigned Teams:

Query Optimization
Case:
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Suppose you have the equality to null query predicate {"a.b": null}. This predicate matches documents where the path "a.b" contains a literal null or where the path "a.b" is absent:

> db.version()
3.1.7-pre-
> db.c.drop()
true
> db.c.insert({a: {b: null}})
> db.c.insert({a: {b: 3}})
> db.c.insert({a: {}})
> db.c.find({"a.b": null})
{ "_id" : ObjectId("55bff6c559a0689eb67b396a"), "a" : { "b" : null } }
{ "_id" : ObjectId("55bff6ce59a0689eb67b396c"), "a" : {  } }

A sparse index with index key pattern {"a.b": 1} does not have index keys for documents missing the path "a.b". Therefore, if the equality to null query were to use a sparse index, it would miss results:

> db.c.ensureIndex({"a.b": 1}, {sparse: true})

// If we hint the query to use a sparse index, then we fail to return the
// document {a: {}}. For this reason, it is incorrect for the query engine to
// assign equality to null predicates to use sparse indices.
> db.c.find({"a.b": null}).hint({"a.b": 1})
{ "_id" : ObjectId("55bff6c559a0689eb67b396a"), "a" : { "b" : null } }

// Without the hint, the query engine gets the answer right using a collection
// scan instead of using the index.
> db.c.find({"a.b": null})
{ "_id" : ObjectId("55bff6c559a0689eb67b396a"), "a" : { "b" : null } }
{ "_id" : ObjectId("55bff6ce59a0689eb67b396c"), "a" : {  } }

Although the general rule is that equality to null predicates can't use sparse indices, this isn't always true. Consider the following example:

> db.version()
3.1.7-pre-
> db.c.drop()
true
> db.c.insert({a: {}, c: {}})
> db.c.insert({a: {b: 1}, c: {}})
> db.c.ensureIndex({"a.b": 1, "c.d": 1}, {sparse: true})
> db.c.find({"a.b": 1, "c.d": null})
{ "_id" : ObjectId("55bff81859a0689eb67b396e"), "a" : { "b" : 1 }, "c" : {  } }

This time we have a compound sparse index {"a.b": 1, "c.d": 1}. Compound sparse indices will only omit index keys for documents where both paths "a.b" and "c.d" are absent; the document {a: {}, c: {}} will not be indexed, but the document {a: {b: 1}, c: {}} will, because path "a.b" exists.

Due to these sparse index semantics, the check for "a.b" equal to 1 in the query ensures that all matching documents have index keys in the sparse index. As a result, it is actually safe to answer "c.d": null using the sparse index. However, the current query engine does not constrain the bounds on "c.d":

> db.c.find({"a.b": 1, "c.d": null}).explain("queryPlanner")
...
					"stage" : "IXSCAN",
					"keyPattern" : {
						"a.b" : 1,
						"c.d" : 1
					},
					"indexName" : "a.b_1_c.d_1",
					"isMultiKey" : false,
					"isUnique" : false,
					"isSparse" : true,
					"isPartial" : false,
					"indexVersion" : 1,
					"direction" : "forward",
					"indexBounds" : {
						"a.b" : [
							"[1.0, 1.0]"
						],
                                                // MISSING OPTIMIZATION! The bounds are not constrained.
						"c.d" : [
							"[MinKey, MaxKey]"
						]
...
}

As an optimization, we could use bounds [[null, null]] for the trailing field "c.d".

is related to

SERVER-36465 Non-multikey sparse indexes can be used to answer query with {$ne:null} predicate

Closed

Assignee:: [DO NOT USE] Backlog - Query Optimization
Reporter:: David Storch
Participants:: [DO NOT USE] Backlog - Query Optimization, David Storch
Votes:: 3 Vote for this issue
Watchers:: 11 Start watching this issue

Created:: Aug 03 2015 11:34:57 PM UTC
Updated:: Dec 06 2022 04:47:05 AM UTC

Details

Description

Attachments

Issue Links

Forms

Activity

People

Dates