[SERVER-21785] Explain with shell can return misleading plan for unindexed sort queries that specify limit or batchSize Created: 07/Dec/15  Updated: 10/Feb/16  Resolved: 10/Feb/16

Status: Closed
Project: Core Server
Component/s: Diagnostics, Querying, Shell
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: J Rassi Assignee: Tess Avitabile (Inactive)
Resolution: Won't Fix Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

There are two issues with the shell's explain helper implementation:

  1. When using "legacy" read mode (the default read mode when connected to 3.0 servers), the shell incorrectly uses the "limit", "batchSize", and "singleBatch" find command options with the explain command. The shell should be using the "ntoreturn" find command option instead.
  2. The shell always forces a negative limit value if a limit is specified, even if the limit value was not specified as negative by the user. The shell should be using a negative limit value only if specified by the user.

As a result, in either of the above cases, the shell will issue a slightly different query to the server depending on whether or not explain is being used. Especially for queries that are sensitive to the value of "ntoreturn", an entirely different plan can be selected.

For example, for unindexed sort queries that specify limit or batchSize:

  • Normal plan selection will choose an OR plan that uses a top-K sort for the first batch, and falls back to un-limited sort for subsequent batches.
  • Plan selection when explain is used will result in a plan that chooses either a top-K sort or an un-limited sort, without the OR fallback logic.

Reproduce as follows:

> db.getMongo().forceReadMode("legacy")
> db.foo.drop()
false
> db.createCollection('foo')
{ "ok" : 1 }
> db.foo.find().sort({a:1}).limit(10).itcount()
0
> db.foo.find().sort({a:1}).limit(10).explain()
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.foo",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$and" : [ ]
		},
		"winningPlan" : {
			"stage" : "SORT",
			"sortPattern" : {
				"a" : 1
			},
			"limitAmount" : 10,
			"inputStage" : {
				"stage" : "SORT_KEY_GENERATOR",
				"inputStage" : {
					"stage" : "COLLSCAN",
					"filter" : {
						"$and" : [ ]
					},
					"direction" : "forward"
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "rassi",
		"port" : 27017,
		"version" : "0.0.0",
		"gitVersion" : "none"
	},
	"ok" : 1
}

At log level 1, the server generates the following log lines:

2015-12-07T12:13:50.349-0500 I COMMAND  [conn1] command test.$cmd command: isMaster { isMaster: 1.0, forShell: 1.0 } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:163 locks:{} protocol:op_command 0ms
2015-12-07T12:13:54.795-0500 I QUERY    [conn1] query test.foo query: { query: {}, orderby: { a: 1.0 } } planSummary: COLLSCAN, COLLSCAN ntoreturn:10 ntoskip:0 keysExamined:0 docsExamined:0 hasSortStage:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:0 reslen:20 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { R: 1 } } } 0ms
2015-12-07T12:13:54.796-0500 I COMMAND  [conn1] command test.$cmd command: isMaster { isMaster: 1.0, forShell: 1.0 } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:163 locks:{} protocol:op_command 0ms
2015-12-07T12:14:01.586-0500 I COMMAND  [conn1] command test.foo command: explain { explain: { find: "foo", filter: {}, limit: 10.0, singleBatch: true, sort: { a: 1.0 } }, verbosity: "queryPlanner" } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:445 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { R: 1 } } } protocol:op_command 0ms

Notice that a single-leaf plan is chosen when explain() is used, and a two-leaf plan (the OR plan) is chosen when itcount() is used. Furthermore, notice that the "singleBatch" find command option is incorrectly added when explain() is used.



 Comments   
Comment by Tess Avitabile (Inactive) [ 10/Feb/16 ]

Issue (2) is fixed. Issue (1) will not be fixed due to the complexity that a fix would require. In versions after 3.0, we cannot accurately explain the query

db.foo.find().sort({a:1}).limit(10)

in legacy read mode. This is because find in legacy read mode uses OP_QUERY, but explain against a version 3.0 server or greater uses the find command. OP_QUERY finds use an OR plan that uses a top-K sort for the first batch, and falls back to un-limited sort for subsequent batches. The find command cannot produce this query plan.

Comment by Githook User [ 10/Feb/16 ]

Author:

{u'username': u'tessavitabile', u'name': u'Tess Avitabile', u'email': u'tess.avitabile@mongodb.com'}

Message: SERVER-21785 Explain with shell does not negate limit
Branch: master
https://github.com/mongodb/mongo/commit/067f160fb9fd6dab96e045183d89776cc554ac41

Generated at Thu Feb 08 03:58:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.