[SERVER-10436] wrong index ranges when using compound index on a list Created: 05/Aug/13  Updated: 10/Dec/14  Resolved: 06/Dec/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Antoine Girbal Assignee: David Storch
Resolution: Duplicate Votes: 4
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-15086 Allow for efficient range queries ove... Closed
is duplicated by SERVER-10880 $elemMatch doesn't effectively use in... Closed
Related
is related to SERVER-10394 when using a generic index over a lis... Backlog
Operating System: ALL
Participants:

 Description   

Creating documents of this form with a compound index:

> for (var i = 0; i < 5000; ++i) { var arr = []; for (var j = 0; j < 10; ++j) { arr.push({n: "prop" + j, v: Math.floor(Math.random() * 1000) }) }; db.generic.insert({props: arr}) }
> db.generic.findOne()
{
  "_id": ObjectId("515dd3b4f0bd676b816aa9b0"),
  "props": [
    {
      "n": "prop0",
      "v": 40
    },
    {
      "n": "prop1",
      "v": 198
    },
...
    {
      "n": "prop9",
      "v": 652
    }
  ]
}
> db.generic.ensureIndex({"props.n": 1, "props.v": 1})

Matching with equality on the value works well:

> db.generic.find({props:{$elemMatch:{n:'prop1', v:800}}}).explain()
{
	"cursor" : "BtreeCursor props.n_1_props.v_1",
	"isMultiKey" : true,
	"n" : 143,
	"nscannedObjects" : 143,
	"nscanned" : 143,
	"nscannedObjectsAllPlans" : 143,
	"nscannedAllPlans" : 143,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 11,
	"indexBounds" : {
		"props.n" : [
			[
				"prop1",
				"prop1"
			]
		],
		"props.v" : [
			[
				800,
				800
			]
		]
	},
	"server" : "agmac.local:27017"
}

Range queries on value do not work well though, see the upper bound is largest possible integer:

> db.generic.find({props:{$elemMatch:{n:'prop1', v:{$gt:800, $lt: 1000}}}}).explain()
{
	"cursor" : "BtreeCursor props.n_1_props.v_1",
	"isMultiKey" : true,
	"n" : 25287,
	"nscannedObjects" : 25287,
	"nscanned" : 25287,
	"nscannedObjectsAllPlans" : 25287,
	"nscannedAllPlans" : 25287,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 114,
	"indexBounds" : {
		"props.n" : [
			[
				"prop1",
				"prop1"
			]
		],
		"props.v" : [
			[
				800,
				1.7976931348623157e+308
			]
		]
	},
	"server" : "agmac.local:27017"
}

If stacking a 2nd $elemMatch (just for kicks) then the ranges seem correct in explain but it does not match any document:

> db.generic.find({props:{$elemMatch:{n:'prop1', v:{$elemMatch: {$gt:800, $lt: 1000}}}}}).explain()
{
	"cursor" : "BtreeCursor props.n_1_props.v_1",
	"isMultiKey" : true,
	"n" : 0,
	"nscannedObjects" : 25287,
	"nscanned" : 25287,
	"nscannedObjectsAllPlans" : 25287,
	"nscannedAllPlans" : 25287,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 224,
	"indexBounds" : {
		"props.n" : [
			[
				"prop1",
				"prop1"
			]
		],
		"props.v" : [
			[
				800,
				1000
			]
		]
	},
	"server" : "agmac.local:27017"
}



 Comments   
Comment by David Storch [ 06/Dec/14 ]

All,

I am going to close this ticket as a duplicate of a more recently filed request (SERVER-15086) for improving the efficiency of certain queries over multikey indices. For now this is working as designed, although this is clearly an area for improvement in future releases.

The comments on SERVER-15086, especially this one and this one, provide lots of useful context for why the system works this way, and what sort of improvements might be possible. We also have a relatively new page in the documentation describing how index bounds are constructed for multikey indices.

Please watch SERVER-15086 for future updates, and feel free to reach out on that ticket with further questions or concerns.

Best,
Dave

Comment by Antoine Girbal [ 23/Oct/13 ]

kmeredith please try out the unstable 2.5.3 build that was just released, the index optimizer was thoroughly reworked.
About nscanned exceeding the count, that is probable since you actually have way more index entries than you have documents: each item in your list is an index entry.

Comment by Kevin Meredith [ 23/Oct/13 ]

Any status on this? Also, would this bug cause a problem with the `nScanned` exceeding the `db.collection.count()`?

I faced such a problem here - http://stackoverflow.com/questions/19526903/mongodb-index-ensureindex-time-to-refresh.

Comment by Antoine Girbal [ 10/Aug/13 ]

Those are related common use cases. SERVER-10394 has a workaround, SERVER-10436 does not.

Generated at Thu Feb 08 03:23:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.