[SERVER-22451] Allow predicates to be covered when indexed path is prefix of matched path Created: 03/Feb/16  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Viacheslav Usov Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Optimization
Participants:

 Description   

Consider the following script.

db.test.insert({_id : {a : 1, b : 1}})
db.test.insert({_id : {a : 1, b : 2}})
db.test.insert({_id : {a : 2, b : 1}})
db.test.insert({_id : {a : 2, b : 2}})
db.test.insert({_id : {a : 3, b : 1}})
db.test.insert({_id : {a : 3, b : 2}})
db.test.insert({_id : {a : 4, b : 1}})
db.test.insert({_id : {a : 4, b : 2}})
db.test.find({"$and" : [{_id : {"$gte" : {a : 2} }}, {_id : {"$lte" : {a : 3, b : MaxKey()} }}, {"_id.b" : 1}]})

==>

{ "_id" : { "a" : 2, "b" : 1 } }
{ "_id" : { "a" : 3, "b" : 1 } }

So far so good. Yet, doing explain() on that reveals the following:

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.test",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "_id.b" : {
                        "$eq" : 1
                    }
                },
                {
                    "_id" : {
                        "$lte" : {
                            "a" : 3,
                            "b" : { "$maxKey" : 1 }
                        }
                    }
                },
                {
                    "_id" : {
                        "$gte" : {
                            "a" : 2
                        }
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "$and" : [
                    {
                        "_id" : {
                            "$lte" : {
                                "a" : 3,
                                "b" : { "$maxKey" : 1 }
                            }
                        }
                    },
                    {
                        "_id" : {
                            "$gte" : {
                                "a" : 2
                            }
                        }
                    },
                    {
                        "_id.b" : {
                            "$eq" : 1
                        }
                    }
                ]
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "_id" : 1
                },
                "indexName" : "_id_",
                "isMultiKey" : false,
                "isUnique" : true,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "forward",
                "indexBounds" : {
                    "_id" : [
                        "[{ a: 2.0 }, { a: 3.0, b: MaxKey }]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 2,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 4,
        "totalDocsExamined" : 4,
        "executionStages" : {
            "stage" : "FETCH",
            "filter" : {
                "$and" : [
                    {
                        "_id" : {
                            "$lte" : {
                                "a" : 3,
                                "b" : { "$maxKey" : 1 }
                            }
                        }
                    },
                    {
                        "_id" : {
                            "$gte" : {
                                "a" : 2
                            }
                        }
                    },
                    {
                        "_id.b" : {
                            "$eq" : 1
                        }
                    }
                ]
            },
            "nReturned" : 2,
            "executionTimeMillisEstimate" : 0,
            "works" : 5,
            "advanced" : 2,
            "needTime" : 2,
            "needYield" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "docsExamined" : 4,
            "alreadyHasObj" : 0,
            "inputStage" : {
                "stage" : "IXSCAN",
                "nReturned" : 4,
                "executionTimeMillisEstimate" : 0,
                "works" : 5,
                "advanced" : 4,
                "needTime" : 0,
                "needYield" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "keyPattern" : {
                    "_id" : 1
                },
                "indexName" : "_id_",
                "isMultiKey" : false,
                "isUnique" : true,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "forward",
                "indexBounds" : {
                    "_id" : [
                        "[{ a: 2.0 }, { a: 3.0, b: MaxKey }]"
                    ]
                },
                "keysExamined" : 4,
                "dupsTested" : 0,
                "dupsDropped" : 0,
                "seenInvalidated" : 0
            }
        },
        "allPlansExecution" : [ ]
    },
    "serverInfo" : {
        "host" : "***",
        "port" : 27017,
        "version" : "3.2.1",
        "gitVersion" : "a14d55980c2cdc565d4704a7e3ad37e4e535c1b2"
    },
    "ok" : 1
}

I may be misinterpreting the output, but it seems to me that the filter

{"_id.b" : 1}

is applied AFTER the full document has been retrieved, NOT when scanning the index. I am led to this conclusion by seeing the filter spec in the FETCH stage and seeing

            "nReturned" : 2,
            "docsExamined" : 4,

in its stats.

So, basically, when the index is on a sub-document, and the query filter includes the sub-document's fields, the server does not seem to understand that it could use the index to evaluate the filter; it goes to the collection instead. That can be very costly.

Some background as to why this is important. The _id index is mandatory. It is therefore desirable to make it useful. It is also non-configurable, so one pretty much has to make _id a sub-document except in simplest cases.



 Comments   
Comment by Viacheslav Usov [ 08/Feb/16 ]

Unfortunately, the referenced documentation does not deal with this issue at hand. As far as I can tell, this issue is not addressed anywhere at all. So I suggest, if the feature does not make it (any time soon), explain in the documentation that a single-field index, if the field is a sub-document, cannot efficiently support range queries with additional filters on the sub-document's fields.

Comment by Kelsey Schubert [ 04/Feb/16 ]

Hi viaus,

Thank you for opening this ticket. Currently, MongoDB does not support the behavior that you are requesting. For additional information, please see our documentation regarding indexes on embedded documents.

I am marking this ticket as a new feature to be scheduled during the next round of planning. Updates will be posted here as they happen.

Kind regards,
Thomas

Generated at Thu Feb 08 04:00:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.