Optimizer off-by-one, identical-meaning queries perform differently depending on use of $lt or $lte

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 1.6.1
    • Component/s: Performance
    • None
    • ALL
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      I have a collection with an index on column 'y'.

      Running these equivalent queries:
      find({y:{$lte:'20101006'}}) or find({y:{$lt:'20101007'}})
      for some reason the $lte performs much better than $lt; it scans fewer rows.

      find({y:{$lte:'20101006'}}).count()
      10740593
      find({y:{$lt:'20101007'}}).count()
      10740593

      find({y:{$lte:'20101006'}}).explain()
      {
      "cursor" : "BtreeCursor y_1",
      "nscanned" : 10740593,
      "nscannedObjects" : 10740593,
      "n" : 10740593,
      "millis" : 10652,
      "indexBounds" :

      { "y" : [ [ "", "20101006" ] ] }

      }

      find({y:{$lt:'20101007'}}).explain()
      {
      "cursor" : "BtreeCursor y_1",
      "nscanned" : 21095326,
      "nscannedObjects" : 10740593,
      "n" : 10740593,
      "millis" : 22625,
      "indexBounds" :

      { "y" : [ [ "", "20101007" ] ] }

      }

      The difference seems to be that the $lt has to include all y with 20101007 even though it's going to throw them out.

      find(

      {y:'20101007'}

      ).count()
      10354733

      10354733 + 10740593 == 21095326

      It seems like there should be a way to stop scanning when you reach your upper limit with a $lt.

            Assignee:
            Aaron Staple (Inactive)
            Reporter:
            Jay Paroline
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated:
              Resolved: