[SERVER-23141] MongoDB is not selecting fastest query plan. Created: 15/Mar/16  Updated: 14/Apr/16  Resolved: 18/Mar/16

Status: Closed
Project: Core Server
Component/s: Performance, Querying
Affects Version/s: 3.3.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Saleem Assignee: Unassigned
Resolution: Done Votes: 0
Labels: query
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OSX, 10GB RAM, 512GB SSD, 2.5GH dual core


Attachments: Text File explain_for_fast_collection.txt     Text File explain_for_slow_collection.txt    
Operating System: ALL
Steps To Reproduce:

1. Create two collections, fast and slow and fill data with help of following script

var a = 0;
while(a++ < 4096*2){
  var randomnumber=Math.ceil(Math.random()*1000000)
 
  // create string in length of 3
  var name = Math.random().toString(36).substring(2,5);
 
  db.fast.insert({name:name, email:name + ".email@example.com", age:randomnumber});
  db.slow.insert({name:name, email:name + ".email@example.com", age:randomnumber});
 
}

On collection fast, create two indexes.

   db.createIndex({age:1});
   db.createIndex({age:1, name:1, email:1})

Now run following query on fast

pageNumber=18;
nPerPage=20; 
db.fast.find({
  age:{$gt:200, $lt:777217}, 
  name:{$gt:"1234", $lt:"z"}, 
  email:{$gt:"bdnsa28831283d", $lt:"z"}
}).skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).explain("allPlansExecution")

and on collection slow

pageNumber=18;
nPerPage=20; 
db.slow.find({
  age:{$gt:200, $lt:777217}, 
  name:{$gt:"1234", $lt:"z"}, 
  email:{$gt:"bdnsa28831283d", $lt:"z"}
}).skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).explain("allPlansExecution")

Please compare executionTimeMillis of both queries.

I have also noticed that query plan of fast is ignoring fastest plan.

I'm attaching full explain of both queries. Looking forward

Participants:

 Description   

Someone brought in my notice that MongoDB is not selecting fast query plan. During a test found that a fully covered index query is many fold slower than one with default index on _id field.



 Comments   
Comment by Saleem [ 18/Mar/16 ]

Wish you had given us chance before closing to verify covered query options too. But anyway, thanks for update

Comment by Kelsey Schubert [ 18/Mar/16 ]

Hi sixthsense,

Thank you for the report. This is expected behavior, and I'll go into to a bit more of an explanation of the results you are seeing.

First, the query predicates are not very selective: if we make the assumption that an examining a document takes the same amount of time as examining an index key, the fast collection will spend more time doing I/O.

        "totalKeysExamined" : NumberInt(563), 
        "totalDocsExamined" : NumberInt(360), 

compared to:

        "totalKeysExamined" : NumberInt(0), 
        "totalDocsExamined" : NumberInt(700), 

Second, since both queries execute quickly, the cost associated with the fast collection evaluating other query plans is more pronounced.

Third, I would like to clarify that the query in the ticket description is not a covered query. A covered query would include a projection to exclude the _id field:

db.fast.find({
  age:{$gt:200, $lt:777217}, 
  name:{$gt:"1234", $lt:"z"}, 
  email:{$gt:"bdnsa28831283d", $lt:"z"}},
  {_id : 0, age : 1, name : 1, email :1}
).skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).explain("allPlansExecution")

For additional information, please review our documentation on query optimization.

Kind regards,
Thomas

Comment by Saleem [ 15/Mar/16 ]

Please find result of explain for both type of collections.

slow is actually fast. Just one Index.
fast is actually slow, multiple indexes.

Please see actual thread on StackOverflow

Generated at Thu Feb 08 04:02:30 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.