Details
-
Improvement
-
Resolution: Unresolved
-
Major - P3
-
None
-
2.6.0
-
None
-
Query Optimization
-
(copied to CRM)
Description
Suppose you have the equality to null query predicate {"a.b": null}. This predicate matches documents where the path "a.b" contains a literal null or where the path "a.b" is absent:
> db.version()
|
3.1.7-pre-
|
> db.c.drop()
|
true
|
> db.c.insert({a: {b: null}})
|
> db.c.insert({a: {b: 3}})
|
> db.c.insert({a: {}})
|
> db.c.find({"a.b": null})
|
{ "_id" : ObjectId("55bff6c559a0689eb67b396a"), "a" : { "b" : null } }
|
{ "_id" : ObjectId("55bff6ce59a0689eb67b396c"), "a" : { } }
|
A sparse index with index key pattern {"a.b": 1} does not have index keys for documents missing the path "a.b". Therefore, if the equality to null query were to use a sparse index, it would miss results:
> db.c.ensureIndex({"a.b": 1}, {sparse: true})
|
|
|
// If we hint the query to use a sparse index, then we fail to return the
|
// document {a: {}}. For this reason, it is incorrect for the query engine to
|
// assign equality to null predicates to use sparse indices.
|
> db.c.find({"a.b": null}).hint({"a.b": 1})
|
{ "_id" : ObjectId("55bff6c559a0689eb67b396a"), "a" : { "b" : null } }
|
|
|
// Without the hint, the query engine gets the answer right using a collection
|
// scan instead of using the index.
|
> db.c.find({"a.b": null})
|
{ "_id" : ObjectId("55bff6c559a0689eb67b396a"), "a" : { "b" : null } }
|
{ "_id" : ObjectId("55bff6ce59a0689eb67b396c"), "a" : { } }
|
Although the general rule is that equality to null predicates can't use sparse indices, this isn't always true. Consider the following example:
> db.version()
|
3.1.7-pre-
|
> db.c.drop()
|
true
|
> db.c.insert({a: {}, c: {}})
|
> db.c.insert({a: {b: 1}, c: {}})
|
> db.c.ensureIndex({"a.b": 1, "c.d": 1}, {sparse: true})
|
> db.c.find({"a.b": 1, "c.d": null})
|
{ "_id" : ObjectId("55bff81859a0689eb67b396e"), "a" : { "b" : 1 }, "c" : { } }
|
This time we have a compound sparse index {"a.b": 1, "c.d": 1}. Compound sparse indices will only omit index keys for documents where both paths "a.b" and "c.d" are absent; the document {a: {}, c: {}} will not be indexed, but the document {a: {b: 1}, c: {}} will, because path "a.b" exists.
Due to these sparse index semantics, the check for "a.b" equal to 1 in the query ensures that all matching documents have index keys in the sparse index. As a result, it is actually safe to answer "c.d": null using the sparse index. However, the current query engine does not constrain the bounds on "c.d":
> db.c.find({"a.b": 1, "c.d": null}).explain("queryPlanner")
|
...
|
"stage" : "IXSCAN",
|
"keyPattern" : {
|
"a.b" : 1,
|
"c.d" : 1
|
},
|
"indexName" : "a.b_1_c.d_1",
|
"isMultiKey" : false,
|
"isUnique" : false,
|
"isSparse" : true,
|
"isPartial" : false,
|
"indexVersion" : 1,
|
"direction" : "forward",
|
"indexBounds" : {
|
"a.b" : [
|
"[1.0, 1.0]"
|
],
|
// MISSING OPTIMIZATION! The bounds are not constrained.
|
"c.d" : [
|
"[MinKey, MaxKey]"
|
]
|
...
|
}
|
As an optimization, we could use bounds [[null, null]] for the trailing field "c.d".
Attachments
Issue Links
- is related to
-
SERVER-36465 Non-multikey sparse indexes can be used to answer query with {$ne:null} predicate
-
- Closed
-