[SERVER-7950] MongoDB not using the most optimal index when sorting on 2 keys Created: 15/Dec/12  Updated: 07/Mar/14  Resolved: 19/Dec/12

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: 2.2.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Vincent Bernat Assignee: Aaron Staple
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian GNU/Linux unstable.


Attachments: File setup.py    
Issue Links:
Depends
depends on SERVER-7945 use plan ordering when reporting "win... Closed
Related
Operating System: ALL
Steps To Reproduce:

1. Execute the script attached to populate the database.
2. Add one of the above index.
3. Run the request with explain.

You should get : no index used, scan and order used.

Participants:

 Description   

With the following request:

db.tasks.find({status: {$in: ["queued", "running"]},
  reserved: {$lt: now},
  type: {$in: ["most-common-type"]},
  tried: {$lt: 5},
  ns: "default"}).sort({priority: -1, _id: 1}).limit(1)

And one of the following indexes:

db.tasks.ensureIndex({priority: -1, _id: 1, ns:1, status: 1, type: 1, reserved: 1, tried: 1})
db.tasks.ensureIndex({ns: 1, priority: -1, _id: 1, status: 1})
db.tasks.ensureIndex({ns: 1, priority: -1, _id: 1})
db.tasks.ensureIndex({priority: -1, _id: 1})

MongoDB is never using the index, always using BasicCursor and a scan and order to get the appropriate result. The dataset is about 5 millions entries. The find() alone would return about 2 millions entries. The ns field is not very selective ("default" is 9 out of 10).

Another oddity that may be related is the fact that despite not using an index, MongoDB is able to not scan the whole dataset to get a result (nscannedObjects may be equal to 2 millions instead of 5).

Without index, MongoDB is able to answer the request in about 10s. By hinting the last specified index, I am able to get an answer in 3s and no scan and order.



 Comments   
Comment by Vincent Bernat [ 18/Dec/12 ]

Oh, I just had a look at SERVER-7945. You can discard my question then.

Comment by Vincent Bernat [ 18/Dec/12 ]

Hi Aaron!

Thanks for the explanation. What about the "scanAndOrder"? Shouldn't be able to use the index to avoid a sort?

Comment by Aaron Staple [ 17/Dec/12 ]

Hi Vincent,

I think the main issue here is that you are running into SERVER-7945, where in certain cases the wrong cursor type is reported at the top of the explain output. The index is still being used, but explain is incorrectly reporting the BasicCursor as being used. You can see that an index is being used by checking the out output of explain( true ).

In the case below, the indexed (ordered) cursor runs until it finds its first match, and then the query finishes. At that time, the unindexed cursor has found two matches (though it hasn't found all results or done any sorting).

c.ensureIndex( {priority: -1, _id: 1, ns:1, status: 1, type: 1, reserved: 1, tried: 1} );
printjson( c.find({status: {$in: ["queued", "running"]},
                reserved: {$lt: new Date()},
            type: {$in: ["most-common-type"]},
            tried: {$lt: 5},
                ns: "default"}).sort({priority: -1, _id: 1}).limit(1).explain( true ) );

{
	"cursor" : "BasicCursor",
	"isMultiKey" : false,
	"n" : 1,
	"nscannedObjects" : 238623,
	"nscanned" : 238623,
	"nscannedObjectsAllPlans" : 238624,
	"nscannedAllPlans" : 477261,
	"scanAndOrder" : true,
	"indexOnly" : false,
	"nYields" : 1,
	"nChunkSkips" : 0,
	"millis" : 1604,
	"indexBounds" : {
		
	},
	"allPlans" : [
		{
			"cursor" : "BtreeCursor priority_-1__id_1_ns_1_status_1_type_1_reserved_1_tried_1 multi",
			"n" : 1,
			"nscannedObjects" : 1,
			"nscanned" : 238638,
			"indexBounds" : {
				"priority" : [
					[
						{
							"$maxElement" : 1
						},
						{
							"$minElement" : 1
						}
					]
				],
				"_id" : [
					[
						{
							"$minElement" : 1
						},
						{
							"$maxElement" : 1
						}
					]
				],
				"ns" : [
					[
						"default",
						"default"
					]
				],
				"status" : [
					[
						"queued",
						"queued"
					],
					[
						"running",
						"running"
					]
				],
				"type" : [
					[
						"most-common-type",
						"most-common-type"
					]
				],
				"reserved" : [
					[
						true,
						ISODate("2012-12-17T08:29:16.431Z")
					]
				],
				"tried" : [
					[
						-1.7976931348623157e+308,
						5
					]
				]
			}
		},
		{
			"cursor" : "BasicCursor",
			"n" : 2,
			"nscannedObjects" : 238623,
			"nscanned" : 238623,
			"indexBounds" : {
				
			}
		}
	],
	"server" : "Aarons-MacBook-Pro.local:27017"
}

Comment by Vincent Bernat [ 15/Dec/12 ]

The correct way to populate the database.

Generated at Thu Feb 08 03:16:05 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.