[SERVER-12939] 2.4 prefers indexed sort plans when no other information is available, 2.6 does not Created: 27/Feb/14  Updated: 11/Jul/16  Resolved: 05/Mar/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.0-rc0
Fix Version/s: 2.6.0-rc1

Type: Bug Priority: Major - P3
Reporter: Ben Becker Assignee: David Storch
Resolution: Done Votes: 0
Labels: 26qa, performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Steps To Reproduce:

// Test 1: different plans; rc0 is slower
db.t.drop();
db.t.ensureIndex({a:1, b:1});
db.t.ensureIndex({d:1, e:1});
for (var i = 0; i < 1000; i++) { db.t.insert({a:i}); }
for (var i = 0; i < 10000; i++) { db.t.insert({a:"1", d:i}); }
db.version();
db.t.find({a:"1"}).sort({d:1}).explain();

// Test 2: same plans; rc0 is faster
db.t.drop();
db.t.ensureIndex({a:1, b:1});
db.t.ensureIndex({d:1, e:1});
for (var i = 0; i < 1000; i++) { db.t.insert({a:i}); }
for (var i = 0; i < 10000; i++) { db.t.insert({a:1, d:i}); }
db.version();
db.t.find({a:1}).sort({d:1}).explain();

Participants:

 Description   

In the attached 'Test 1' case, v2.6.0-rc0 selects a different query plan than v2.4.9, and v2.6.0-rc0 takes significantly longer to execute. The indexed key is a string with many duplicates.

In 'Test 2', the indexed key is an integer with many duplicates, and v2.6.0-rc0 is faster than v2.4.9, and selects the same query plan.

Note: the two repro cases attached are identical, except for the value type of 'a'

Explain for test 1:

> db.version();
2.6.0-rc1-pre-
> db.t.find({a:"1"}).sort({d:1}).explain();
{
	"cursor" : "BtreeCursor a_1_b_1",
	"isMultiKey" : false,
	"n" : 10000,
	"nscannedObjects" : 10000,
	"nscanned" : 10000,
	"nscannedObjectsAllPlans" : 10098,
	"nscannedAllPlans" : 10098,
	"scanAndOrder" : true,
	"indexOnly" : false,
	"nYields" : 158,
	"nChunkSkips" : 0,
	"millis" : 50,
	"indexBounds" : {
		"a" : [
			[
				"1",
				"1"
			]
		],
		"b" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "zzyzx:27018",
	"filterSet" : false
}
 
> db.version();
2.4.9
> db.t.find({a:"1"}).sort({d:1}).explain();
{
	"cursor" : "BtreeCursor d_1_e_1",
	"isMultiKey" : false,
	"n" : 10000,
	"nscannedObjects" : 11000,
	"nscanned" : 11000,
	"nscannedObjectsAllPlans" : 13203,
	"nscannedAllPlans" : 13203,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 35,
	"indexBounds" : {
		"d" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		],
		"e" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "zzyzx:27018"
}

Explain for test 2:

> db.version();
2.6.0-rc1-pre-
> db.t.find({a:1}).sort({d:1}).explain();
{
	"cursor" : "BtreeCursor d_1_e_1",
	"isMultiKey" : false,
	"n" : 10001,
	"nscannedObjects" : 11000,
	"nscanned" : 11000,
	"nscannedObjectsAllPlans" : 11197,
	"nscannedAllPlans" : 11198,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 87,
	"nChunkSkips" : 0,
	"millis" : 14,
	"indexBounds" : {
		"d" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		],
		"e" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "zzyzx:27018",
	"filterSet" : false
}
 
> db.version();
2.4.9
> db.t.find({a:1}).sort({d:1}).explain();
{
	"cursor" : "BtreeCursor d_1_e_1",
	"isMultiKey" : false,
	"n" : 10001,
	"nscannedObjects" : 11000,
	"nscanned" : 11000,
	"nscannedObjectsAllPlans" : 13201,
	"nscannedAllPlans" : 13201,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 24,
	"indexBounds" : {
		"d" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		],
		"e" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "zzyzx:27018"
}

FWIW, the delta in the NLTK c++ test is an order of magnitude, though the dataset varies widely (see QA/QA-401/QA401.cpp for details):

> db.regressions.findOne({$nor:[{qry:/where/}, {qry:/exists/}]})
{
	"_id" : ObjectId("530f6bad7e6c53525919dfad"),
	"deltaRatio" : "9.55",
	"deltaUs" : 917.67,
	"v249Avg" : "107.33",
	"v249Results" : [
		106,
		108,
		109,
		106,
		107,
		108
	],
	"rc0Avg" : "1025.00",
	"rc0Results" : [
		1005,
		1081,
		1011,
		1026,
		1013,
		1014
	],
	"testId" : "GeneratedTests::GeneratedQuery607[Compound1]",
	"qry" : "{ query: { a: \"\" }, orderby: { d: 1 } }",
	"proj" : "{ NONE: 1 }",
	"idx" : "Compound1",
	"rc0Count" : 0,
	"v249Count" : 0,
	"countsMatch" : true
}



 Comments   
Comment by Githook User [ 05/Mar/14 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-12939 prefer non-blocking sort plans when no plans produce or hit EOF
Branch: master
https://github.com/mongodb/mongo/commit/37abe9e1814e5c6ab9c18970690b157dc13e9ede

Generated at Thu Feb 08 03:30:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.