[SERVER-1910] Optimizer off-by-one, identical-meaning queries perform differently depending on use of $lt or $lte Created: 08/Oct/10  Updated: 17/Mar/11  Resolved: 08/Oct/10

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 1.6.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jay Paroline Assignee: Aaron Staple
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

I have a collection with an index on column 'y'.

Running these equivalent queries:
find({y:{$lte:'20101006'}}) or find({y:{$lt:'20101007'}})
for some reason the $lte performs much better than $lt; it scans fewer rows.

find({y:{$lte:'20101006'}}).count()
10740593
find({y:{$lt:'20101007'}}).count()
10740593

find({y:{$lte:'20101006'}}).explain()
{
"cursor" : "BtreeCursor y_1",
"nscanned" : 10740593,
"nscannedObjects" : 10740593,
"n" : 10740593,
"millis" : 10652,
"indexBounds" :

{ "y" : [ [ "", "20101006" ] ] }

}

find({y:{$lt:'20101007'}}).explain()
{
"cursor" : "BtreeCursor y_1",
"nscanned" : 21095326,
"nscannedObjects" : 10740593,
"n" : 10740593,
"millis" : 22625,
"indexBounds" :

{ "y" : [ [ "", "20101007" ] ] }

}

The difference seems to be that the $lt has to include all y with 20101007 even though it's going to throw them out.

find(

{y:'20101007'}

).count()
10354733

10354733 + 10740593 == 21095326

It seems like there should be a way to stop scanning when you reach your upper limit with a $lt.



 Comments   
Comment by Aaron Staple [ 08/Oct/10 ]

This should already be implemented in 1.7.0, by SERVER-726

Generated at Thu Feb 08 02:58:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.