Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-10472

find(a).sort(b,_id) doesn't use index(a,b) to minimize scans

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: 2.4.4
    • Fix Version/s: Backlog
    • Component/s: Indexing, Querying
    • Labels:
    • Environment:
      OSX, MongoLabs
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      db.temp.insert({_id:1, a:1, b:1})
      db.temp.insert({_id:2, a:1, b:2})
      db.temp.insert({_id:3, a:1, b:3})
      db.temp.insert({_id:4, a:1, b:4})
      db.temp.insert({_id:5, a:1, b:5})
      db.temp.insert({_id:6, a:1, b:6})
       
      db.temp.ensureIndex({a:1, b:1})
       
      // Scans 3 objects
      db.temp.find({a:1}).limit(3).sort({b:1}).explain()
       
      // Include _id as secondary sort field
      // Scans 6 objects.  Should only need to scan 4  (3 + enough to reach next unique b value)
      db.temp.find({a:1}).limit(3).sort({b:1, _id:1}).explain()

      Show
      db.temp.insert({_id:1, a:1, b:1}) db.temp.insert({_id:2, a:1, b:2}) db.temp.insert({_id:3, a:1, b:3}) db.temp.insert({_id:4, a:1, b:4}) db.temp.insert({_id:5, a:1, b:5}) db.temp.insert({_id:6, a:1, b:6})   db.temp.ensureIndex({a:1, b:1})   // Scans 3 objects db.temp.find({a:1}).limit(3).sort({b:1}).explain()   // Include _id as secondary sort field // Scans 6 objects. Should only need to scan 4 (3 + enough to reach next unique b value) db.temp.find({a:1}).limit(3).sort({b:1, _id:1}).explain()

      Description

      I have a collection of documents with keys _id, a, and b. I have an index {a:1,b:1}.

      When I do find(a).sort(b).limit(5) the index is used, and only 5 documents are scanned.

      When I do find(a).sort(b,_id).limit(5) (with or without a hint) it appears the index is only used to match a - as all documents with the matching value for a are scanned (nscanned and nscannedObjects are high).

      The behavior I would expect is that it would scan through a and b values enough to get to the limit (5), and then it would stop scanning when it found the next unique b (or a). It would then sort on _id within the scanned records.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                1 Vote for this issue
                Watchers:
                13 Start watching this issue

                Dates

                • Created:
                  Updated: