[SERVER-3104] index bound improvements for elemMatch query on multikey index Created: 16/May/11 Updated: 28/Oct/15 Resolved: 10/Oct/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 2.3.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Aaron Staple | Assignee: | Aaron Staple |
| Resolution: | Done | Votes: | 22 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Participants: |
| Description |
|
Mongo does not compute a cartesian product when creating a compound index on multiple fields. If the document { a:[ { b:1, c:2 }, { b:10, c:20 }] } is indexed according to index { 'a.b':1, 'a.c':1 }, the index keys created are { '':1, '':2 } and { '':10, '':20 }. (There is no index key { '':1, '':20 } for example.)

A) Now, suppose we have a query { 'a.b':1, 'a.c':20 }. This query is supposed to match the document, because an 'a.b' value of 1 exists in the document, and an 'a.c' value of 20 exists in the document. However, there is no index key containing both 1 in the 'a.b' position and 20 in the 'a.c' position. As a result, the index bounds on 'a.b' will be [[ 1, 1 ]] but there will not be any index bounds on 'a.c'. This means the index key { '':1, '':2 } will be retrieved and used to find the full document, and the Matcher will determine that the full document matches the query. Here's a demo:
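The key generation and matching behavior in case A can be modeled with a small Python sketch. This is an illustrative model only, not MongoDB's implementation and not the ticket's original demo; the helper names `index_keys` and `matches_dotted` are hypothetical, and the model assumes every indexed field lives directly inside elements of a single array.

```python
def index_keys(doc, array_field, subfields):
    """Model assumption: one index key per array element, no cartesian product."""
    return [tuple(elem.get(f) for f in subfields) for elem in doc[array_field]]


def matches_dotted(doc, array_field, criteria):
    """Dotted-field semantics: each criterion may be met by a different element."""
    return all(
        any(elem.get(field) == value for elem in doc[array_field])
        for field, value in criteria.items()
    )


doc = {"a": [{"b": 1, "c": 2}, {"b": 10, "c": 20}]}

# Keys for index { 'a.b':1, 'a.c':1 }: one tuple per element of 'a'.
keys = index_keys(doc, "a", ["b", "c"])

# Query { 'a.b':1, 'a.c':20 }: only 'a.b' can be bounded, so scan on b == 1.
candidates = [key for key in keys if key[0] == 1]

# The full document still matches, because 1 and 20 come from different elements.
document_matches = matches_dotted(doc, "a", {"b": 1, "c": 20})
```

In this model the scan retrieves the key (1, 2) even though its 'a.c' value is not 20, which is why a bound on 'a.c' cannot safely be applied to the dotted query.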
B) However, if $elemMatch is used in the query then an element of the 'a' array must match all the $elemMatch criteria. In other words, if the query becomes { a:{ $elemMatch: { b:1, c:20 }} } then the original document will not match because no element of a matches { b:1, c:20 }. In this case, index keys lacking a match on one field (like { '':1, '':2 }) need not be examined, and precise index bounds can be used for the 'a.c' field. Here's a demo:
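Case B can be sketched the same way. The code below is a hedged Python model rather than server code: `matches_elem_match` and `keys_in_bounds` are hypothetical helpers, and index bounds are represented as simple inclusive (low, high) pairs per key position, with None meaning unbounded.

```python
def matches_elem_match(doc, array_field, criteria):
    """$elemMatch semantics: a single array element must meet every criterion."""
    return any(
        all(elem.get(field) == value for field, value in criteria.items())
        for elem in doc[array_field]
    )


def keys_in_bounds(keys, bounds):
    """Return keys whose every position falls inside its inclusive bound."""
    def inside(value, bound):
        return bound is None or bound[0] <= value <= bound[1]

    return [k for k in keys if all(inside(v, b) for v, b in zip(k, bounds))]


doc = {"a": [{"b": 1, "c": 2}, {"b": 10, "c": 20}]}
keys = [(1, 2), (10, 20)]  # index keys for { 'a.b':1, 'a.c':1 }

# Bounds on 'a.b' only: the key (1, 2) is examined needlessly.
loose = keys_in_bounds(keys, [(1, 1), None])

# Bounds on both fields, valid for { a:{ $elemMatch:{ b:1, c:20 } } }.
tight = keys_in_bounds(keys, [(1, 1), (20, 20)])

elem_match_result = matches_elem_match(doc, "a", {"b": 1, "c": 20})
```

With both bounds applied, no key is scanned, which agrees with the $elemMatch query not matching the document.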
This ticket implements the index bounds behavior seen in B. More precisely, our old behavior was that if two indexed field paths shared a common prefix, then only the first of those field paths appearing in the index would have its index bounds used for the query. With this ticket, if the index bounds for the two field paths come from the same $elemMatch clause then the index bounds on both field paths are used for the query.

Additionally, this optimization is only applied if the field names within the $elemMatch are undotted. There are some cases where the optimization does not work correctly if the fields are dotted, described in , a:{ $elemMatch: { b:2, c:3 }} may use the field path [[10, max number]] on 'a.b' rather than [[2, 2]].
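The old and new selection rules described for this ticket can be approximated as follows. This is a rough Python sketch under stated assumptions, not the server's query planner: `fields_with_bounds` is a hypothetical function, "common prefix" is modeled as sharing the first path component, and each predicate is tagged with a made-up $elemMatch clause id plus the field name as written inside that clause.

```python
def fields_with_bounds(index_fields, predicates, elem_match_aware):
    """Pick which indexed paths get bounds.

    predicates maps an indexed path to (clause_id, inner_name); clause_id is
    None for a plain dotted predicate (assumed tagging, for illustration only).
    """
    chosen = []
    for path in index_fields:
        if path not in predicates:
            continue
        prefix = path.split(".")[0]
        earlier = [p for p in chosen if p.split(".")[0] == prefix]
        if not earlier:
            chosen.append(path)  # first path under this prefix always gets bounds
            continue
        if not elem_match_aware:
            continue  # old behavior: later paths under the prefix are unbounded
        clause, inner = predicates[path]
        if clause is None or "." in inner:
            continue  # not from $elemMatch, or a dotted name inside it
        if all(
            predicates[p][0] == clause and "." not in predicates[p][1]
            for p in earlier
        ):
            chosen.append(path)  # same $elemMatch clause, undotted names
    return chosen


index = ["a.b", "a.c"]
dotted_query = {"a.b": (None, None), "a.c": (None, None)}
elem_match_query = {"a.b": ("em1", "b"), "a.c": ("em1", "c")}

old_rule = fields_with_bounds(index, elem_match_query, elem_match_aware=False)
new_rule = fields_with_bounds(index, elem_match_query, elem_match_aware=True)
dotted_rule = fields_with_bounds(index, dotted_query, elem_match_aware=True)
```

In this sketch only the $elemMatch query gains a second bounded field under the new rule; the plain dotted query keeps bounds on 'a.b' alone, as in case A.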
|
| Comments |
| Comment by auto [ 15/Feb/13 ] |
|
Author: Siddharth Singh <siddharth.singh@10gen.com> Date: 2013-02-14T17:04:29Z Message: |
| Comment by auto [ 10/Oct/12 ] |
|
Author: Aaron <aaron@10gen.com> Date: 2012-07-31T16:10:59-07:00 Message: |
| Comment by Brian Adkins [ 24/Feb/12 ] |
|
I voted and watched, but I just wanted to add a comment re: how important this is to me. I have an app that is stuck on v1.8 due to the lack of this. I would love to be able to upgrade. Some more context on my particular issue is in this thread: |
| Comment by Kai Virkki [ 23/Feb/12 ] |
|
@Eliot: http://groups.google.com/group/mongodb-user/browse_thread/thread/f3ac1160d54f9881 |
| Comment by Eliot Horowitz (Inactive) [ 23/Feb/12 ] |
|
@kai - not sure - can you follow up on http://groups.google.com/group/mongodb-user/ with doc, query and explain output |
| Comment by Kai Virkki [ 22/Feb/12 ] |
|
Is my question on dba.stackexchange related to this same issue? http://dba.stackexchange.com/questions/13661/how-to-index-dynamic-attributes-in-mongodb |
| Comment by auto [ 17/May/11 ] |
|
Author: Aaron <aaron@10gen.com> (login: astaple) Message: |