[SERVER-4520] $elemMatch on multiple sub-properties does not make full use of compound multi-key index Created: 18/Dec/11  Updated: 07/Mar/14  Resolved: 19/Dec/11

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Yair Halevi Assignee: Aaron Staple
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows


Issue Links:
Duplicate
duplicates SERVER-3104 index bound improvements for elemMatc... Closed
Operating System: ALL
Participants:

 Description   

Example

PRIMARY> db.try2.ensureIndex({"ext.s":1,"ext.k":1})
PRIMARY> db.try2.insert({"ext":[{"s":"s1","k":"key1"},{"s":"s2","k":"key1"}]})
PRIMARY> db.try2.insert({"ext":[{"s":"s1","k":"key2"}]})
PRIMARY> db.try2.find({"ext":{$elemMatch:{"s":"s1","k":"key1"}}}).explain()
{
        "cursor" : "BtreeCursor ext.s_1_ext.k_1",
        "nscanned" : 2,
        "nscannedObjects" : 2,
        "n" : 1,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : true,
        "indexOnly" : false,
        "indexBounds" : {
                "ext.s" : [
                        [
                                "s1",
                                "s1"
                        ]
                ],
                "ext.k" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        }
}

The query works, but it only uses the first key in the compound index, when it should be able to use both. It therefore scans both documents with "ext.s":"s1", instead of just the one matching document.

Note that without multi-keys (if all inserted documents have a single element in the array), both keys will be used.

I believe the use case here is fairly common. In our case, our documents can be associated with identifiers from 3rd party systems. We model each identifier as a triplet of properties (foreign-system-name,foreign-system-id-version,id). We'd like to query for documents using the combination of all 3 properties (in an $elemMatch fashion), but this won't make full use of the index.

Our alternatives for now are to concatenate the sub-properties into a single property or to order the keys in the index to ensure low cardinality. Both options aren't very pretty and have some limitations.



 Comments   
Comment by Aaron Staple [ 19/Dec/11 ]

Hi Yair,

We already have a ticket open for this, SERVER-3104. Please vote on that one if you'd like to see us implement this optimization.

Generated at Thu Feb 08 03:06:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.