Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-12923

Plan ranking is bad for plans with blocking stages

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: 2.4.8, 2.6.0-rc0
    • Fix Version/s: Backlog
    • Component/s: Querying
    • Operating System:
      ALL
    • Case:

      Description

      Currently we rank plans by work()'ing them some number of times and seeing who produces the most results.

      If there is a plan with a blocking stage, it will therefore almost always be ranked lower than plans without a blocking stage. This is because plans without a blocking stage immediately start producing results. Although the plan with the blocking stage doesn't appear to be making progress immediately, it may end up looking much better / hitting EOF much faster if we were to run the plans to completion.

      Blocking stages for which this is an issue:

      • SORT
      • AND_HASH

      Here's one way to reproduce the issue, for the blocking SORT stage:

      > t = db.t
      test.t
      > t.drop()
      true
      > t.ensureIndex({a: 1})
      WriteResult({ "nInserted" : 1 })
      > t.ensureIndex({b: 1})
      WriteResult({ "nInserted" : 1 })
      > for (var i = 0; i < 10000; i++) { t.save({a: 1}); }
      WriteResult({ "nInserted" : 1 })
      > for (var i = 0; i < 150; i++) { t.save({b: 1}); }
      WriteResult({ "nInserted" : 1 })
      > t.find({b: 1}).sort({a: 1}).explain()
      {
      	"cursor" : "BtreeCursor a_1",
      	"isMultiKey" : false,
      	"n" : 150,
      	"nscannedObjects" : 10150,
      	"nscanned" : 10150,
      	"nscannedObjectsAllPlans" : 10347,
      	"nscannedAllPlans" : 10348,
      	"scanAndOrder" : false,
      	"indexOnly" : false,
      	"nYields" : 85,
      	"nChunkSkips" : 0,
      	"millis" : 139,
      	"indexBounds" : {
      		"a" : [
      			[
      				{
      					"$minElement" : 1
      				},
      				{
      					"$maxElement" : 1
      				}
      			]
      		]
      	},
      	"server" : "localhost:27017",
      	"filterSet" : false
      }

      The query ends up using index {a: 1}, which involves scanning 10,000 docs and fetching them from disk if they're not in memory. We would have liked to use index {b: 1}, which would involve fetching only 150 docs and then sorting them in memory.

        Attachments

          Issue Links

            Activity

              People

              • Votes:
                10 Vote for this issue
                Watchers:
                27 Start watching this issue

                Dates

                • Created:
                  Updated: