[SERVER-24786] Compound multikey index does not compound indexbounds when nested in an object Created: 24/Jun/16  Updated: 14/Jul/16  Resolved: 29/Jun/16

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.6.10, 2.6.11
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Werner Smit
Assignee: Kelsey Schubert
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-15086 Allow for efficient range queries ove... Closed
Related
related to SERVER-22401 Implement index bounds generation rul... Closed
Operating System: ALL
Steps To Reproduce:

//Unbound predicate scenario (nested)

use test_db;
db.c.drop();
db.c.save({nest: {a:'hello', b: ['hello', 'world']}});
db.c.save({nest: {a:'world', b: ['hello', 'world']}});
db.c.ensureIndex({"nest.a":1, "nest.b":1});
db.c.find({"nest.a": 'hello', "nest.b": 'world'}).explain();
 
//Working scenario (non-nested)
use test_db;
db.c.drop();
db.c.save({a: 'hello', b: ['hello', 'world']});
db.c.save({a: 'world', b: ['hello', 'world']});
db.c.ensureIndex({"a":1, "b":1});
db.c.find({"a": 'hello', "b": "world"}).explain();

Participants:

 Description   

When creating a compound multikey index on a string field and an array field (both nested inside the same object), the index bounds are constrained on only the leading field in the index.

This behavior doesn't occur when the string and array fields are at the root level. In that case, a query on both fields compounds both predicates in the index bounds.

E.g.:
Given the following structure:

{nest: {a:'hello', b: ['hello', 'world']}}

We add a compound index on `nest.a` and `nest.b`.
When we query for both `nest.a` and `nest.b`, the index bounds for the predicate on `nest.b` are left unconstrained:

 db.c.find({"nest.a": 'hello', "nest.b": 'world'}).explain();
{
	"cursor" : "BtreeCursor nest.a_1_nest.b_1",
	"isMultiKey" : true,
	"n" : 1,
	"nscannedObjects" : 2,
	"nscanned" : 2,
	"nscannedObjectsAllPlans" : 2,
	"nscannedAllPlans" : 2,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 0,
	"indexBounds" : {
		"nest.a" : [
			[
				"hello",
				"hello"
			]
		],
		"nest.b" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	},
	"server" : "....."
}

However, if we assume the same structure but move the fields out of the object, both predicates are bound.

{a: 'hello', b: ['hello', 'world']}

db.c.find({"a": 'hello', "b": "world"}).explain();
{
	"cursor" : "BtreeCursor a_1_b_1",
	"isMultiKey" : true,
	"n" : 1,
	"nscannedObjects" : 1,
	"nscanned" : 1,
	"nscannedObjectsAllPlans" : 1,
	"nscannedAllPlans" : 1,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 0,
	"indexBounds" : {
		"a" : [
			[
				"hello",
				"hello"
			]
		],
		"b" : [
			[
				"world",
				"world"
			]
		]
	},
	"server" : "...."
}
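
To compare the two explain outputs above programmatically, a small helper along these lines could be used. This is only a sketch: `unboundedFields` is a hypothetical name, and it assumes the bounds are available as plain objects shaped exactly as printed above (a `$minElement` / `$maxElement` pair marking an unconstrained field). Whether the live shell object has that exact shape is an assumption.

```javascript
// Hypothetical helper: returns the index fields whose bounds cover the
// entire key range, i.e. [ {$minElement: 1}, {$maxElement: 1} ].
function unboundedFields(indexBounds) {
  function isFullRange(interval) {
    var lo = interval[0], hi = interval[1];
    return lo !== null && typeof lo === 'object' && '$minElement' in lo &&
           hi !== null && typeof hi === 'object' && '$maxElement' in hi;
  }
  return Object.keys(indexBounds).filter(function (field) {
    return indexBounds[field].some(isFullRange);
  });
}

// Bounds copied from the nested (problem) explain output.
var nestedBounds = {
  "nest.a": [["hello", "hello"]],
  "nest.b": [[{ "$minElement": 1 }, { "$maxElement": 1 }]]
};
// Bounds copied from the root-level (working) explain output.
var rootBounds = {
  "a": [["hello", "hello"]],
  "b": [["world", "world"]]
};

console.log(unboundedFields(nestedBounds)); // fields left unconstrained
console.log(unboundedFields(rootBounds));   // empty when every field is bound
```

For the nested case the helper reports `nest.b`; for the root-level case it reports nothing.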

Is this expected behavior? I couldn't find anything in the docs that points to limitations of nested compound multikey indexes.



 Comments   
Comment by Ramon Fernandez Marina [ 29/Jun/16 ]

wernerj101, to add to Thomas' answer, please note that you can download the 3.3.9 development release today, which contains support for path-level multikey tracking, and test if the index bounds work as you would expect.

Regards,
Ramón.

Comment by Kelsey Schubert [ 29/Jun/16 ]

Hi wernerj101,

Thanks for the report. This is expected behavior, but it will be improved in MongoDB 3.4. The case where fields in the compound multikey index share a path prefix, as in

{"nest.a": 'hello', "nest.b": 'world'}

is problematic because predicates over these fields may refer to subdocuments within the same array field. As Dave describes in SERVER-6720, bounds cannot be combined when they constrain the same array field. For supplemental information, please see our documentation on Multikey Index Bounds.

In our upcoming major release, MongoDB 3.4, this issue will be resolved by SERVER-22401. Previously, tighter index bounds were not possible, because the query planner had no way to determine whether the document

{nest: [{a: 'hello'}, {b: 'world'}]}

also existed in the collection. However, we now have path-level multikey tracking and have used it to generate tighter index bounds in our development branch.
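
The reasoning above can be illustrated with a toy model of key generation for the index {"nest.a": 1, "nest.b": 1}. This is a simplified sketch in plain JavaScript, not MongoDB's actual implementation: the names `indexKeys` and `combinedBoundsHit` are made up for illustration, missing fields are modeled as null, and parallel arrays are not handled.

```javascript
// Simplified model of compound multikey key generation for the index
// {"nest.a": 1, "nest.b": 1}. Illustrative only; not MongoDB's real code.
function indexKeys(doc) {
  // Treat a non-array `nest` as a one-element array so both shapes share a path.
  var elements = [].concat(doc.nest);
  var keys = [];
  elements.forEach(function (el) {
    // A missing field is modeled as null; an array field yields one key per item.
    var as = el.a === undefined ? [null] : [].concat(el.a);
    var bs = el.b === undefined ? [null] : [].concat(el.b);
    as.forEach(function (a) {
      bs.forEach(function (b) { keys.push([a, b]); });
    });
  });
  return keys;
}

// Would a scan with both bounds combined (a == x AND b == y) hit any key?
function combinedBoundsHit(keys, x, y) {
  return keys.some(function (k) { return k[0] === x && k[1] === y; });
}

// Document from the report: `nest` is a plain object.
var nested = { nest: { a: 'hello', b: ['hello', 'world'] } };
// Document from the comment above: `nest` is an array of subdocuments.
var split = { nest: [{ a: 'hello' }, { b: 'world' }] };

console.log(JSON.stringify(indexKeys(nested)));
console.log(JSON.stringify(indexKeys(split)));
console.log(combinedBoundsHit(indexKeys(nested), 'hello', 'world'));
console.log(combinedBoundsHit(indexKeys(split), 'hello', 'world'));
```

Both documents match the query {"nest.a": 'hello', "nest.b": 'world'}, but in this model only the first one produces a ('hello', 'world') key. A scan restricted to that single key would miss the second document, which is why the planner has to leave `nest.b` unbounded unless it knows that `nest` is never an array.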

Please see SERVER-15086 for a more in-depth discussion of this improvement.

Thank you,
Thomas
