[SERVER-3173] Planner should use path-level multikey info to generate covered plans when possible Created: 31/May/11  Updated: 19/Sep/17  Resolved: 13/Jan/17

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: None
Fix Version/s: 3.5.2

Type: Improvement
Priority: Major - P3
Reporter: Antoine Girbal
Assignee: David Storch
Resolution: Done
Votes: 32
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
is documented by DOCS-9937 Update docs to reflect multikey cover... Closed
Duplicate
is duplicated by SERVER-7595 indexOnly is never true when isMultiK... Closed
is duplicated by SERVER-4463 Allow covered index use for non-array... Closed
Related
related to SERVER-7959 Potentially unexpected scans with com... Closed
related to SERVER-15086 Allow for efficient range queries ove... Closed
is related to SERVER-8454 Multi-key index prevents bounds being... Closed
is related to SERVER-29550 Leverage multiKeyPath information to ... Backlog
Backwards Compatibility: Fully Compatible
Sprint: Query 2017-01-23
Participants:
Case:

 Description   

In this example, I want to get a list of user names for a particular event.

> db.col2.insert({name: "ag", event: ["open", "tourney"]})
> db.col2.insert({name: "joe", event: ["match", "event"]})
> db.col2.insert({name: "bill", event: ["open", "event"]})
> db.col2.ensureIndex({event: 1, name: 1})
> db.col2.find({event: "open"}, {name: 1, _id: 0})
{ "name" : "ag" }
{ "name" : "bill" }
> db.col2.find({event: "open"}, {name: 1, _id: 0}).explain()
{
	"cursor" : "BtreeCursor event_1_name_1",
	"nscanned" : 2,
	"nscannedObjects" : 2,
	"n" : 2,
	"millis" : 0,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : true,
	"indexOnly" : false,
	"indexBounds" : {
		"event" : [
			[
				"open",
				"open"
			]
		],
		"name" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	}
}

Note that isMultiKey is true and indexOnly is false, even though the requested field (name) is not an array in any document.
Since we already track duplicate document locations, it should be easy to use a covered index to return just the name values.
That could speed up certain queries considerably.
We could track the multikey flag at the field level, and decide whether the query can be covered based on which fields are returned.
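
As a rough illustration of the field-level idea, the check could look like the sketch below. This is plain JavaScript, not server code: the function name canCoverProjection and the multikeyPaths set are hypothetical, dotted paths are ignored, and the query predicate is assumed to be fully answered by the index bounds.

```javascript
// Hypothetical model of a field-level multikey check (not the server's code).
// indexKeyPattern: e.g. { event: 1, name: 1 }
// multikeyPaths:   Set of indexed fields that hold an array in some document
// projection:      inclusion projection, e.g. { name: 1, _id: 0 }
function canCoverProjection(indexKeyPattern, multikeyPaths, projection) {
  // Fields the query asks to have returned.
  const wanted = Object.keys(projection).filter((f) => projection[f]);
  // _id is returned by default unless it is explicitly excluded.
  if (!("_id" in projection)) {
    wanted.push("_id");
  }
  // Covered only if every returned field is in the index and is not multikey.
  return wanted.every((f) => f in indexKeyPattern && !multikeyPaths.has(f));
}

const index = { event: 1, name: 1 };
const multikey = new Set(["event"]); // only event holds arrays in col2

console.log(canCoverProjection(index, multikey, { name: 1, _id: 0 }));  // true
console.log(canCoverProjection(index, multikey, { event: 1, _id: 0 })); // false
```

With a single index-wide flag, the first call would also have to return false, which is the behavior shown in the explain output above.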

An alternative would be to add a hint flag that forces a covered index plan.
In this example the output would be the same.
For a field that really is multikey, the returned array could be incomplete or in the wrong order, but that would be expected when the flag is used.



 Comments   
Comment by Githook User [ 13/Jan/17 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-3173 use path-level multikey metadata to generate covered plans if possible

This allows queries using a multikey index which project out
the array fields to avoid collection access.
Branch: master
https://github.com/mongodb/mongo/commit/8953400c0f999bcb9da067edfb7978130516ac04

Comment by Aaron Staple [ 08/Nov/12 ]

One thing to note for the implementation is that if there is an array on a field that is not being projected, we would still need to dedup results.

For example,

index: { a:1, b:1 }
projection: { a:1, _id:0 }

doc { a:4, b:5 } -> key { '':4, '':5 }
doc { a:4, b:[ 5, 6 ] } -> two keys { '':4, '':5 }, { '':4, '':6 }

We'd probably dedup based on disk loc to avoid returning 4 twice, but deduping is not currently a requirement for a non-multikey index.
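
The dedup concern can be reproduced with a small simulation. The sketch below is plain JavaScript under simplifying assumptions: indexKeys and coveredScan are hypothetical helpers, the array position of a document stands in for its disk location, and index key ordering is not modeled.

```javascript
// Expand one document into the keys of a compound index. An array value
// yields one key per element (a simplified model of multikey indexing).
function indexKeys(doc, fields) {
  let keys = [[]];
  for (const f of fields) {
    const values = Array.isArray(doc[f]) ? doc[f] : [doc[f]];
    keys = keys.flatMap((k) => values.map((v) => [...k, v]));
  }
  return keys;
}

// Answer a projection on one indexed field from index keys alone.
// When dedup is true, each document location is returned at most once.
function coveredScan(docs, fields, projectField, dedup) {
  const pos = fields.indexOf(projectField);
  const entries = docs.flatMap((doc, loc) =>
    indexKeys(doc, fields).map((key) => ({ key, loc }))
  );
  const seen = new Set();
  const out = [];
  for (const { key, loc } of entries) {
    if (dedup && seen.has(loc)) continue; // this document was already returned
    seen.add(loc);
    out.push({ [projectField]: key[pos] });
  }
  return out;
}

const docs = [{ a: 4, b: 5 }, { a: 4, b: [5, 6] }];
console.log(coveredScan(docs, ["a", "b"], "a", false).length); // 3
console.log(coveredScan(docs, ["a", "b"], "a", true).length);  // 2
```

Without the dedup step the second document contributes a:4 twice, even though the projected field a is not an array.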
