[SERVER-11563] Different index behavior for 1 and -1 direction indexes Created: 04/Nov/13  Updated: 10/Dec/14  Resolved: 11/Mar/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.4.6, 2.4.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Cory Mintz Assignee: David Storch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server11563.js    
Operating System: ALL
Participants:

 Description   

I have collection with about 4MM documents. There are two multikey indexes on two fields that are not present in all documents. The two indexes have the second field in a different direction.
i.e.

db.test_index.ensureIndex({"a.b" : 1, "a.c" : 1})
db.test_index.ensureIndex({"a.b" : 1, "a.c" : -1})

The result of doing a find, where the fields are null, against each index is dramatically different. (1 object scanned vs 1,055,153)

> db.test_index.find({"a.b": null, "a.c": null }, {"a.b": 1, "a.c": 1 }).hint({"a.b": 1, "a.c": 1 }).limit(1).explain()
{
	"cursor" : "BtreeCursor a.b_1_a.c_1",
	"isMultiKey" : true,
	"n" : 1,
	"nscannedObjects" : 1,
	"nscanned" : 1,
	"nscannedObjectsAllPlans" : 1,
	"nscannedAllPlans" : 1,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 0,
	"indexBounds" : {
		"a.b" : [
			[
				null,
				null
			]
		],
		"a.c" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	}
}
 
> db.test_index.find({"a.b": null, "a.c": null }, {"a.b": 1, "a.c": 1 }).hint({"a.b": 1, "a.c": -1 }).limit(1).explain()
{
	"cursor" : "BtreeCursor a.b_1_a.c_-1",
	"isMultiKey" : true,
	"n" : 1,
	"nscannedObjects" : 1055153,
	"nscanned" : 1055153,
	"nscannedObjectsAllPlans" : 1055153,
	"nscannedAllPlans" : 1055153,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 177775,
	"indexBounds" : {
		"a.b" : [
			[
				null,
				null
			]
		],
		"a.c" : [
			[
				{
					"$maxElement" : 1
				},
				{
					"$minElement" : 1
				}
			]
		]
	}
}

The following is a short script to reproduce. After running this the 1 index will show one object scanned, while the -1 index will show two.

db.test_index.ensureIndex({"a.b" : 1, "a.c" : 1})
db.test_index.ensureIndex({"a.b" : 1, "a.c" : -1})
db.test_index.insert(
[
{
	"_id" : ObjectId("5198fb0acfe8202c49ab389d"),
	"a" : [
		{
			"c" : ISODate("2013-08-17T22:14:40Z"),
		},
		{
			"b" : ObjectId("5175284b50e71b43f29cd447"),
			"c" : ISODate("2013-08-17T22:15:18Z")
		}
	]
},
{ "_id" : ObjectId("4ec0cb4eb2a44b0a3bf6102d") },
{
	"_id" : ObjectId("4ec9f21100ad3b3ee3ca927a"),
	"a" : [
		{
			"b" : ObjectId("4f6c377f00aa296ff90001f3"),
			"c" : ISODate("2012-08-05T13:52:49Z")
		}
	]
}
]
)
 
db.test_index.find({ "a.b" : null, "a.c" : null }, { "a.b" : 1, "a.c" : 1 }).hint({ "a.b" : 1, "a.c" : 1 }).limit(-1).explain()
db.test_index.find({ "a.b" : null, "a.c" : null }, { "a.b" : 1, "a.c" : 1 }).hint({ "a.b" : 1, "a.c" : -1 }).limit(-1).explain()



 Comments   
Comment by David Storch [ 11/Mar/14 ]

I spoke with cory.mintz@10gen.com in person, and we decided that this is working as designed.

You might think that these queries will execute by looking up all of the index keys {"": null, "": null}. However, since the index is multikey and the fields "a.b" and "a.c" share prefix "a", the bounds cannot be combined in this fashion. Instead, the query will execute by looking up everything in the index for which "a.b"==null, then fetching from disk and filtering by "a.c"==null. The reason for this is somewhat obscure, but it's documented in the comments here: https://github.com/mongodb/mongo/blob/1ff73551566e30e1a2ca786115b5824bfeff188a/src/mongo/db/query/planner_access.h#L39.

When the index is scanned in the forwards direction, it hits the keys {"": null, "": null} first. On the other other hand, when the index is scanned in the backwards direction it must go through all the keys where "a.b" is null but "a.c" is non-null before it hits {"": null, "": null}. This explains the discrepancy in nscannedObjects and nscanned between the two index directions.

I think what you want to do is change the query to the following to ensure that it is covered by the index:

db.coll.find({a: {$elemMatch: {b: null, c: null}}}, {_id: 0, "a.b": 1, "a.c": 1});

Comment by Eliot Horowitz (Inactive) [ 04/Nov/13 ]

Slightly simpler example

Generated at Thu Feb 08 03:26:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.