[SERVER-20302] Strange execution plan for range queries with $elemMatch Created: 05/Sep/15  Updated: 08/Sep/15  Resolved: 08/Sep/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.0.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sergey Isaev Assignee: Sam Kleinman (Inactive)
Resolution: Duplicate Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-15086 Allow for efficient range queries ove... Closed
Operating System: ALL
Steps To Reproduce:

Script to reproduce:

var createDoc = function(i) {
  return {
    name : 'object' + i,
    value : i,
  };
}
 
var createDocs = function(i) {
  return [
  	createDoc(i),
  	createDoc(i + 1),
  	createDoc(i + 2),
  	createDoc(i + 3),
  ];
}
 
for (var i = 0; i < 10000; ++i) {
  db.test.insert({
    collection : createDocs(i)
  })
}
 
db.test.createIndex({ 'collection.value' : 1 })
db.test.find({ collection : { $elemMatch : { value : { $gte : 9000, $lte : 9010 } } } }).explain()

I see the following output:

{ 
    "queryPlanner" : {
        "plannerVersion" : NumberInt(1), 
        "namespace" : "smartcat_isaev_dev.test", 
        "indexFilterSet" : false, 
        "parsedQuery" : {
            "collection" : {
                "$elemMatch" : {
                    "$and" : [
                        {
                            "value" : {
                                "$lte" : NumberInt(9010)
                            }
                        }, 
                        {
                            "value" : {
                                "$gte" : NumberInt(9000)
                            }
                        }
                    ]
                }
            }
        }, 
        "winningPlan" : {
            "stage" : "FETCH", 
            "filter" : {
                "collection" : {
                    "$elemMatch" : {
                        "$and" : [
                            {
                                "value" : {
                                    "$lte" : NumberInt(9010)
                                }
                            }, 
                            {
                                "value" : {
                                    "$gte" : NumberInt(9000)
                                }
                            }
                        ]
                    }
                }
            }, 
            "inputStage" : {
                "stage" : "IXSCAN", 
                "keyPattern" : {
                    "collection.value" : NumberInt(1)
                }, 
                "indexName" : "collection.value_1", 
                "isMultiKey" : true, 
                "direction" : "forward", 
                "indexBounds" : {
                    "collection.value" : [
                        "[-inf.0, 9010.0]"
                    ]
                }
            }
        }, 
        "rejectedPlans" : [
 
        ]
    }, 
    "serverInfo" : {
        "host" : "mongodb-dev", 
        "port" : NumberInt(27017), 
        "version" : "3.0.3", 
        "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105"
    }, 
    "ok" : NumberInt(1)
}

Participants:

 Description   

It seems like the query planner ignores one of the range ends, and scans the index from the beginning. Obviously the query is unreasonably slow on large collections - up to 1 hour on the collection with ~20M records. Am i missing something or is it a bug?



 Comments   
Comment by Sam Kleinman (Inactive) [ 08/Sep/15 ]

Thanks for this report, I've discussed this with our query team, and it looks like this is currently expected behavior. Because, the $elemMatch statement doesn't join the entire indexed path (i.e. value rather than collection.value), there's no way for the query planner to be able to ensure that the value field is not itself an array, and as a result queries must examine a large number of keys from the index to ensure that the result set is not missing documents that match the query predicate.

This issue is addressed by SERVER-15086. I'm sorry for the frustration this has caused, and I hope that the linked ticket will provide more clarity on this issue. I'm going to go ahead and close this issue now, but thanks again for the report.

Regards,
sam

Generated at Thu Feb 08 03:53:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.