[SERVER-13611] Missing sort order for compound index leads to unnecessary in-memory sort Created: 16/Apr/14  Updated: 11/Jul/16  Resolved: 17/Apr/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.0
Fix Version/s: 2.6.1, 2.7.0

Type: Bug Priority: Major - P3
Reporter: Steve Briskin (Inactive) Assignee: David Storch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-13618 Optimization for sorted $in queries n... Closed
Operating System: ALL
Backport Completed:
Participants:

 Description   
Issue Status as of April 18, 2014

ISSUE SUMMARY
In query execution, the index scan phase returns a stream of documents and a set of all sort orders that are satisfied by the stream. Version 2.6.0 contains a logical bug that in some cases misses a sort order in this set. This can lead to an additional in-memory sort phase that would not be necessary.

USER IMPACT
This is a behavior change from version 2.4 that can negatively impact performance of the affected queries. In cases where the size of returned documents exceeds the limit of 32MB, the query fails with an error.

WORKAROUNDS
In most cases, an appropriate index can be found that supports the sort. See the ticket comments for details and an example.

RESOLUTION
The patch fixes a logical error in the query execution code and now properly returns all sort orders.

AFFECTED VERSIONS
Version 2.6.0 is affected by this issue.

PATCHES
The patch is included in the 2.6.1 production release.

Original description

Suppose you have a compound index {a: 1, b: 1, c: 1, d: 1} and a query with point intervals over 'a' and 'b'. The index scan can provide the following sort orders (both ascending and descending):

{a: 1}, {a: 1, b: 1}, {a: 1, b: 1, c: 1}, {a: 1, b: 1 c: 1, d: 1}, {b: 1, c: 1, d: 1}, {c: 1, d: 1}, and {c: 1}

The code as it stands considers all of these sort orders except for one: {c: 1}. Specifically, it misses prefixes of the sort orders which do not begin with the first field in the compound index.

There appears to be a regression in the sort phase between 2.6.0 and 2.4.8.
It's easiest to illustrate it with an an example. In 2.4.8 the sort uses the index, whereas in 2.6.0 it doesn't, which in our case leads to an overflow error.

db.slices.drop()
var c = Array(250000);
for(i=0; i<250000; i++) c[i] = i;
 
db.slices.ensureIndex({group:1, rs:1, timestamp:1, _id : 1 });
for(i = 0; i < 100; i++) { db.slices.insert({group : "A", rs: "name", timestamp : new Date(), dan : "bp", data:c});}

In 2.4.8

> db.slices.find({group : "A", rs:"name"}, {data : 0}).sort({timestamp : -1}).explain()
{
	"cursor" : "BtreeCursor group_1_rs_1_timestamp_1__id_1 reverse",
	"isMultiKey" : false,
	"n" : 100,
	"nscannedObjects" : 100,
	"nscanned" : 100,
	"nscannedObjectsAllPlans" : 100,
	"nscannedAllPlans" : 100,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 0,
	"indexBounds" : {
		"group" : [
			[
				"A",
				"A"
			]
		],
		"rs" : [
			[
				"name",
				"name"
			]
		],
		"timestamp" : [
			[
				{
					"$maxElement" : 1
				},
				{
					"$minElement" : 1
				}
			]
		],
		"_id" : [
			[
				{
					"$maxElement" : 1
				},
				{
					"$minElement" : 1
				}
			]
		]
	},
	"server" : "boxster:27017"
}

In 2.6.0:

db.slices.find({group : "A", rs:"name"}, {data : 0}).sort({timestamp : -1}).explain()
2014-04-16T11:23:57.863-0400 error: {
	"$err" : "Runner error: Overflow sort stage buffered data usage of 35000910 bytes exceeds internal limit of 33554432 bytes",
	"code" : 17144

If the sort is changed to

{timestamp : 1}

to match the direction of the index, the query behaves as expected in 2.6.0.

> db.slices.find({group : "A", rs:"name"}, {data : 0}).sort({timestamp : 1}).explain()
{
	"clauses" : [
		{
			"cursor" : "BtreeCursor group_1_rs_1_timestamp_1__id_1",
			"isMultiKey" : false,
			"n" : 100,
			"nscannedObjects" : 100,
			"nscanned" : 100,
			"scanAndOrder" : false,
			"indexOnly" : false,
			"nChunkSkips" : 0,
			"indexBounds" : {
				"group" : [
					[
						"A",
						"A"
					]
				],
				"rs" : [
					[
						"name",
						"name"
					]
				],
				"timestamp" : [
					[
						{
							"$minElement" : 1
						},
						{
							"$maxElement" : 1
						}
					]
				],
				"_id" : [
					[
						{
							"$minElement" : 1
						},
						{
							"$maxElement" : 1
						}
					]
				]
			}
		}
	],
	"cursor" : "QueryOptimizerCursor",
	"n" : 100,
	"nscannedObjects" : 100,
	"nscanned" : 100,
	"nscannedObjectsAllPlans" : 100,
	"nscannedAllPlans" : 100,
	"scanAndOrder" : false,
	"nYields" : 1,
	"nChunkSkips" : 0,
	"millis" : 0,
	"server" : "boxster:27017",
	"filterSet" : false
}

Credit: danny.gottlieb@gmail.com



 Comments   
Comment by Rudi Wijaya [ 05/May/14 ]

Related to : SERVER-13831 Find with $gte/$gt/$lte/$lt on Date field and sorting on another field leads to unnecessary in-memory sort without using index

Comment by Githook User [ 17/Apr/14 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-13611 fix sort elimination in the case of trailing fields
(cherry picked from commit 9ec25e399fc775361ed37fd5abb78960659a308b)
Branch: v2.6
https://github.com/mongodb/mongo/commit/c27685781d30a2646b5e75270ee81b6954303210

Comment by Githook User [ 17/Apr/14 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-13611 fix sort elimination in the case of trailing fields
Branch: master
https://github.com/mongodb/mongo/commit/9ec25e399fc775361ed37fd5abb78960659a308b

Comment by David Storch [ 16/Apr/14 ]

For both the sorts on {timestamp: 1} and {timestamp: -1}, we are erroneously falling through to the "explode for scans" logic (see SERVER-1205). The difference in the behavior for the ascending and descending cases can be attribute to a problem in explodeForSort: the explode logic does not consider reversing the index scans in order to get the requested sort order.

So there are actually two problems here:

  1. We don't properly identify all sort orders provided by an index scan (covered by this issue).
  2. Explode for sort does not check whether it can get the needed sort order by reversing the child index scans (covered by SERVER-13618).
Comment by David Storch [ 16/Apr/14 ]

This is a bug in how we check for the sort orders provided by index scans during sort analysis (see https://github.com/mongodb/mongo/blob/fad935606f3d7edc4bae30f8f15288a144b4785d/src/mongo/db/query/planner_analysis.cpp#L360-L373).

We fail to recognize that we can get the desired sort order when it is a prefix of a sort order which the index scan claims to provide. As a result, the example query will work as expected if you drop the index on {group:1, rs:1, timestamp:1, _id : 1} and instead build an index on {group:1, rs:1, timestamp:1}. The bug is triggered by a trailing field in the index key pattern (in this case, the trailing "_id" field).

Generated at Thu Feb 08 03:32:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.