[SERVER-36818] $expr does not use multi-key index Created: 23/Aug/18  Updated: 27/Oct/23  Resolved: 24/Aug/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.6.3, 4.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Trevor Wilson Assignee: Nick Brewer
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to DOCS-11997 $expr cannot use a multikey index Closed
Operating System: ALL
Steps To Reproduce:

Start with empty collection and the index:

> db.comment.drop()
true
> db.comment.createIndex({ 'links.id': 1 })
{
        "createdCollectionAutomatically" : true,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}

Initially, the query strategy indicates it will use the correct index

> db.comment.find({$expr:{$eq:['$links.id', 'some-string']}}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.comment",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "$expr" : {
                                                "$eq" : [
                                                        "$links.id",
                                                        {
                                                                "$const" : "some-string"
                                                        }
                                                ]
                                        }
                                },
                                {
                                        "links.id" : {
                                                "$_internalExprEq" : "some-string"
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "FETCH",
                        "filter" : {
                                "$expr" : {
                                        "$eq" : [
                                                "$links.id",
                                                {
                                                        "$const" : "some-string"
                                                }
                                        ]
                                }
                        },
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "links.id" : 1
                                },
                                "indexName" : "links.id_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "links.id" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "links.id" : [
                                                "[\"some-string\", \"some-string\"]"
                                        ]
                                }
                        }
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "caa1852655c1",
                "port" : 27017,
                "version" : "3.6.3",
                "gitVersion" : "9586e557d54ef70f9ca4b43c26892cd55257e1a5"
        },
        "ok" : 1
}

But when an element is inserted, it "discovers" that its a multi-key index

 

> db.comment.insert({ links: [{ id: 'some-string', type: 'test' }] })
WriteResult({ "nInserted" : 1 })

and the query no longer uses the correct index

> db.comment.find({$expr:{$eq:['$links.id', 'some-string']}}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.comment",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "$expr" : {
                                                "$eq" : [
                                                        "$links.id",
                                                        {
                                                                "$const" : "some-string"
                                                        }
                                                ]
                                        }
                                },
                                {
                                        "links.id" : {
                                                "$_internalExprEq" : "some-string"
                                        }
                                }
                        ]
                },
                "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                                "$and" : [
                                        {
                                                "$expr" : {
                                                        "$eq" : [
                                                                "$links.id",
                                                                {
                                                                        "$const" : "some-string"
                                                                }
                                                        ]
                                                }
                                        },
                                        {
                                                "links.id" : {
                                                        "$_internalExprEq" : "some-string"
                                                }
                                        }
                                ]
                        },
                        "direction" : "forward"
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "caa1852655c1",
                "port" : 27017,
                "version" : "3.6.3",
                "gitVersion" : "9586e557d54ef70f9ca4b43c26892cd55257e1a5"
        },
        "ok" : 1
}

 

Participants:

 Description   

An $expr expression in a query will not use a multi-key index. Ticket SERVER-31760 indicates this was fixed in 3.6.3 but still fails to work if the index is a multi-key index. This manifests only after an element is inserted that makes the index a multi-key index. 

A normal query will use the correct index, but, as with the motivation for SERVER-31760, it is desired to work within a $lookup pipeline where $expr would be required. 

The problem exists in 3.6.3 as well as 4.0.1



 Comments   
Comment by Asya Kamsky [ 03/Nov/20 ]

craig@bipsync.com I'm afraid no update on this.

Comment by Craig Marvelley [ 03/Nov/20 ]

Hi Asya. I don't suppose there's any update on when this alternate syntax for expressions might be available? I'm having issues with writing performant queries due to this limitation so it's relevant to my interests .

Comment by Asya Kamsky [ 20/May/19 ]

twilson sorry I missed your comment before.

You can see some related discussion in SERVER-32549 but basically using a multikey index for $in aggregation expression would return both matches for array members and matches for top level value, because in the index we don't differentiate if the value happened to be an array element or not. That means that in the example we were discussing using an index would match both documents where links.id is an array with a match contained and when it's not an array but subdocument, which would be incorrect

We are working on supporting an alternate syntax for expressions that would allow using indexes "normally" but unfortunately it's not yet scheduled for any particular version.

Comment by Trevor Wilson [ 24/Aug/18 ]

Ok, it is clear I had misunderstood $expr, thanks for clearing that up. 

However, is it truly the case that $expr won't use a multi-key index no matter the expression? Reworking to use $in yields the correct results but still doesn't use the index. 

db.comment.find({$expr:{$in:['some-string', '$links.id']}}).explain()

Comment by Asya Kamsky [ 24/Aug/18 ]

twilson

The aggregation expression

 {$eq:['$links.id', 'some-string']} 

is not equivalent to find expression

 {'links.id' :{$eq:'some-string'}} 

- in find this expression means links.id is either equal to 'some-string' or it's an array containing 'some-string'. In aggregation $eq is a strict equality only, and it does not reach into arrays.

When we tried always using indexes for $expr expressions, we discovered that incorrect results would be returned when index was multikey. When value being compared equal is not an array the semantics of both expressions is the same in find and aggregation.

In fact, if links.id is an array, the $expr you want to use would be $in and not $eq.

 

Comment by Trevor Wilson [ 24/Aug/18 ]

Really? Sounds strange to me that a normal query would use the index while the $expr equivalent does not...

In my use case it was used to $lookup related documents that were multi-key indexed, but, since it induces a collection scan to do the lookup, it was not performant and we had to abandon it in favor of doing the queries manually outside the aggregation pipeline. Is there any plans to make this optimization? Is there a better way? 

Comment by Nick Brewer [ 24/Aug/18 ]

twilson This behavior is actually expected - the $expr operator will not work with a multikey index.

I've opened a separate ticket to get some clarification on this in the documentation.

-Nick

Generated at Thu Feb 08 04:44:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.