[SERVER-39567] Query with min/max modifier uses hashed index Created: 13/Feb/19  Updated: 29/Oct/23  Resolved: 12/Apr/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 4.0.4
Fix Version/s: 4.1.11

Type: Bug Priority: Critical - P2
Reporter: Artem Assignee: Ian Boros
Resolution: Fixed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by DRIVERS-653 Require hint for min/max find options Closed
Documented
is documented by DOCS-12635 Docs for SERVER-39567: Query with min... Closed
Problem/Incident
causes PYTHON-1811 Raise a deprecation warning for min/m... Closed
Related
related to CDRIVER-3099 Specify hint with min/max tests for M... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Steps To Reproduce:
  1. Set up a single-node replica set (node-1);
  2. Create a collection with a hashed `_id` index:

rs0:PRIMARY> use foo
switched to db foo
rs0:PRIMARY> db.bar.insert({_id: "test"})
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY> db.bar.createIndex({_id: "hashed"})
{
	"createdCollectionAutomatically" : true,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1,
        ...
}
rs0:PRIMARY> db.bar.getIndices()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "foo.bar"
	},
	{
		"v" : 2,
		"key" : {
			"_id" : "hashed"
		},
		"name" : "_id_hashed",
		"ns" : "foo.bar"
	}
]
rs0:PRIMARY> db.bar.find().min({_id: ""})
{ "_id" : "test" }

  3. Create an empty replica set member (node-2) and add it to the replica set;
  4. Step down `node-1`;
  5. Run the query again (one record expected, but the result set is actually empty):

rs0:PRIMARY> db.bar.find().min({_id: ""})
rs0:PRIMARY>  

Sprint: Query 2019-04-22
Participants:

 Description   

Issue

A query with the min/max modifier uses the first index containing the related fields.

On a freshly synced secondary, the `_id` index ends up last in the index order (which itself looks like a bug), and a `find` with the `min`/`max` modifier starts using the hashed `_id` index, which now comes first.

When this secondary becomes primary, the shard returns incomplete results for the query.

user:PRIMARY> db.socialTopics.getIndices()
[
	{
		"v" : 2,
		"key" : {
			"_id" : "hashed"
		},
		"name" : "_id_hashed",
		"background" : true,
		"ns" : "user.socialTopics"
	},
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "user.socialTopics"
	}
]
user:PRIMARY> db.socialTopics.find({}, {_id: 1}).min({_id: ObjectId("5bb74d8cefa37173fe2cf692")}).sort({_id: 1}).limit(5).explain()
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "user.socialTopics",
		"indexFilterSet" : false,
		"parsedQuery" : {
			
		},
		"winningPlan" : {
			"stage" : "PROJECTION",
			"transformBy" : {
				"_id" : 1
			},
			"inputStage" : {
				"stage" : "SORT",
				"sortPattern" : {
					"_id" : 1
				},
				"limitAmount" : 5,
				"inputStage" : {
					"stage" : "SORT_KEY_GENERATOR",
					"inputStage" : {
						"stage" : "FETCH",
						"inputStage" : {
							"stage" : "IXSCAN",
							"keyPattern" : {
								"_id" : "hashed"
							},
							"indexName" : "_id_hashed",
							"isMultiKey" : false,
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 2,
							"direction" : "forward",
							"indexBounds" : {
								
							}
						}
					}
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	...
} 

Why we use the min/max modifier

We use the min/max modifier for iterating over collections, because MongoDB has no operation like `after` or `before`.

We can't use `$gt`/`$gte`/`$lt`/`$lte` because:

  1. these operators always evaluate to false for values of different types (type bracketing);
  2. it's not trivial to build a query with bounds over multiple fields.
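The second point can be illustrated with a small sketch: expanding a "tuple after" bound into the equivalent nested `$or` predicate that `$gt`/`$eq` would require (plain JavaScript; `tupleAfter` is a hypothetical helper, and note it still inherits the type-bracketing problem from point 1):

```javascript
// Build the predicate keyset pagination needs when resuming after a tuple:
// (a > a0) OR (a == a0 AND b > b0) OR (a == a0 AND b == b0 AND c > c0)
function tupleAfter(fields, lastValues) {
  const or = [];
  for (let i = 0; i < fields.length; i++) {
    const clause = {};
    // all earlier fields must be equal to the last-seen values...
    for (let j = 0; j < i; j++) clause[fields[j]] = { $eq: lastValues[j] };
    // ...and the current field must be strictly greater.
    clause[fields[i]] = { $gt: lastValues[i] };
    or.push(clause);
  }
  return { $or: or };
}

// Example: resume after the tuple (1, 1, 20) on a sort by {a: 1, b: 1, c: 1}.
const predicate = tupleAfter(["a", "b", "c"], [1, 1, 20]);
```

The clause count grows linearly with the number of sort fields, which is why a single tuple-valued bound such as `min()` is so much more convenient.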

Workaround

It looks like `hint` lets us select the correct index directly.
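A minimal sketch of that workaround against the reproduction above (the shell lines are illustrative, not captured output):

```
rs0:PRIMARY> db.bar.find().min({_id: ""}).hint({_id: 1})
{ "_id" : "test" }
```

With the explicit `hint({_id: 1})`, the `min()` bound is interpreted against the plain `_id` index regardless of the order in which the indexes were built; `hint({_id: "hashed"})` would instead request the hashed-range interpretation.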



 Comments   
Comment by Githook User [ 11/Apr/19 ]

Author:

{'name': 'Ian Boros', 'username': 'puppyofkosh', 'email': 'puppyofkosh@gmail.com'}

Message: SERVER-39567 Change find min/max options to require hint
Branch: master
https://github.com/mongodb/mongo/commit/502df279c7476c01758ab210728f4acc4a27a218

Comment by Artem [ 07/Mar/19 ]

Interesting trick, but it doesn't solve the pagination issue:

  1. I can't use `$expr` for pagination by more than one field;
  2. It doesn't affect the index bounds.
Comment by Asya Kamsky [ 07/Mar/19 ]

bozaro 

Continuing on capabilities of $expr and aggregation semantics, regarding your use case 2

> sort after tuple need really complex expression

 

You can achieve that with $expr as this (if sorting on a, b, c):

db.order.find({$expr:{$gt:[{a:"$a", b:"$b", c:"$c"}, {a:1,b:1,"c":"20"} ]}}).sort({a:1, b:1, c:1})
 

This is if the last values on the last page are 1, 1, "20".

Full example:

db.order.find().sort({a:1, b:1, c:1})
{ "_id" : ObjectId("5c8139442deebe452ba8c369"), "a" : null, "b" : "0" }
{ "_id" : ObjectId("5c81393a2deebe452ba8c367"), "a" : 0, "b" : 0 }
{ "_id" : ObjectId("5c8139402deebe452ba8c368"), "a" : 0, "b" : "0" }
{ "_id" : ObjectId("5c8139362deebe452ba8c366"), "a" : 1, "b" : 0 }
{ "_id" : ObjectId("5c8139342deebe452ba8c365"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("5c8139312deebe452ba8c364"), "a" : 1, "b" : 1, "c" : 1 }
{ "_id" : ObjectId("5c8139612deebe452ba8c36a"), "a" : 1, "b" : 1, "c" : 2 }
{ "_id" : ObjectId("5c8139632deebe452ba8c36b"), "a" : 1, "b" : 1, "c" : 20 }
{ "_id" : ObjectId("5c8139692deebe452ba8c36c"), "a" : 1, "b" : 1, "c" : "20" }
{ "_id" : ObjectId("5c81396c2deebe452ba8c36d"), "a" : 1, "b" : "1", "c" : "20" }
{ "_id" : ObjectId("5c8138a82deebe452ba8c362"), "a" : {  } }
{ "_id" : ObjectId("5c8138892deebe452ba8c360"), "a" : { "a" : 1 } }
{ "_id" : ObjectId("5c8138b12deebe452ba8c363"), "a" : { "a" : 1, "b" : null } }
{ "_id" : ObjectId("5c81387a2deebe452ba8c35d"), "a" : { "a" : 1, "b" : 1 } }
{ "_id" : ObjectId("5c8138842deebe452ba8c35f"), "a" : { "a" : 1, "b" : "1" } }
{ "_id" : ObjectId("5c81388e2deebe452ba8c361"), "a" : { "b" : 1 } }
{ "_id" : ObjectId("5c8138812deebe452ba8c35e"), "a" : { "a" : "1", "b" : 1 } }
db.order.find({$expr:{$gt:[{a:"$a", b:"$b", c:"$c"}, {a:1,b:1,"c":"20"} ]}}).sort({a:1, b:1, c:1})
{ "_id" : ObjectId("5c81396c2deebe452ba8c36d"), "a" : 1, "b" : "1", "c" : "20" }
{ "_id" : ObjectId("5c8138a82deebe452ba8c362"), "a" : {  } }
{ "_id" : ObjectId("5c8138892deebe452ba8c360"), "a" : { "a" : 1 } }
{ "_id" : ObjectId("5c8138b12deebe452ba8c363"), "a" : { "a" : 1, "b" : null } }
{ "_id" : ObjectId("5c81387a2deebe452ba8c35d"), "a" : { "a" : 1, "b" : 1 } }
{ "_id" : ObjectId("5c8138842deebe452ba8c35f"), "a" : { "a" : 1, "b" : "1" } }
{ "_id" : ObjectId("5c81388e2deebe452ba8c361"), "a" : { "b" : 1 } }
{ "_id" : ObjectId("5c8138812deebe452ba8c35e"), "a" : { "a" : "1", "b" : 1 } }
db.order.explain().find({$expr:{$gt:[{a:"$a", b:"$b", c:"$c"}, {a:1,b:1,"c":"20"} ]}}).sort({a:1, b:1, c:1})
{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.order",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$expr" : {
				"$gt" : [
					{
						"a" : "$a",
						"b" : "$b",
						"c" : "$c"
					},
					{
						"$const" : {
							"a" : 1,
							"b" : 1,
							"c" : "20"
						}
					}
				]
			}
		},
		"winningPlan" : {
			"stage" : "FETCH",
			"filter" : {
				"$expr" : {
					"$gt" : [
						{
							"a" : "$a",
							"b" : "$b",
							"c" : "$c"
						},
						{
							"$const" : {
								"a" : 1,
								"b" : 1,
								"c" : "20"
							}
						}
					]
				}
			},
			"inputStage" : {
				"stage" : "IXSCAN",
				"keyPattern" : {
					"a" : 1,
					"b" : 1,
					"c" : 1
				},
				"indexName" : "a_1_b_1_c_1",
				"isMultiKey" : false,
				"multiKeyPaths" : {
					"a" : [ ],
					"b" : [ ],
					"c" : [ ]
				},
				"isUnique" : false,
				"isSparse" : false,
				"isPartial" : false,
				"indexVersion" : 2,
				"direction" : "forward",
				"indexBounds" : {
					"a" : [
						"[MinKey, MaxKey]"
					],
					"b" : [
						"[MinKey, MaxKey]"
					],
					"c" : [
						"[MinKey, MaxKey]"
					]
				}
			}
		},
		"rejectedPlans" : [ ]
	},
	"serverInfo" : {
		"host" : "Asyas-MacBook-Pro-2.local",
		"port" : 27017,
		"version" : "4.0.1",
		"gitVersion" : "54f1582fc6eb01de4d4c42f26fc133e623f065fb"
	},
	"ok" : 1
}

Comment by Asya Kamsky [ 07/Mar/19 ]

bozaro please note that in aggregation expressions, there is no type bracketing and therefore if you use aggregation

{$expr:{$gt:["$a",""]}}

this will match all values of "a" that are > "" whether or not they are a string (following the BSON comparison order).

As of 3.6 you can achieve aggregation semantics in find via $expr syntax.

I understand the current workaround providing a hint is working for you, but I'm mentioning the aggregation semantics for completeness. I believe it should work the way you describe your use case, but let us know if it would not. I do want to mention that since indexes serve queries with type bracketing semantics, using $expr for $gt/$lt expressions will not use the index in order to get correct complete results.

Comment by Artem [ 07/Mar/19 ]

Actually we use `min()`/`max()` as a very ugly workaround.

Originally we needed a simple pagination feature, but we can't use `$gt`/`$gte`/`$lt`/`$lte` for this purpose:

  1. `$gt` actually means: greater than *and* of a comparable type, but some of our fields contain values of different types (null, string, ObjectId);
  2. sorting after a tuple needs a really complex expression like `(a $after a0) or ((a $eq a0) and (b $after b0)) or ((a $eq a0) and (b $eq b0) and (c $after c0))` instead of `(a, b, c) $after (a0, b0, c0)`.

The pagination workaround with `min()`/`max()` is the only solution we found, but:

  1. `min()`/`max()` needs exactly the same field list as the index;
  2. we have to parse the query to compute both the `min()` and `max()` bounds (without `max()` the query would read to the end of the collection, ignoring other `find` modifiers);
  3. we have to respect the index order;
  4. `min()`/`max()` are not symmetric modifiers (one bound is inclusive, the other exclusive).
Comment by Юрий Соколов [ 07/Mar/19 ]

A friend demonstrated to me that "range of partition hashes" does work currently. I apologize.

Comment by Юрий Соколов [ 07/Mar/19 ]

Sorry, I will be rude.

Does min/max currently work as a "range of partition hashes"? If not, then why on earth would you want it to work this way? Why add almost meaningless extra semantics to a simple thing and make it more complex for no reason?

If someone really needs a "range of partition hashes", then let's add it as a separate construct instead of overloading `min`/`max`.

Comment by David Storch [ 06/Mar/19 ]

asya and I spoke in person. We discussed two high-level alternatives for how to address this ticket:

  1. Think of min()/max() as part of MongoDB's logical query language. This would mean that the result set matching min()/max() would not change if indices are built or dropped. It would also mean that "hashed" indices would not be eligible to answer a min()/max() query, as I described above. Going this route would mean that users would no longer always have a way to express shard key ranges in order to, for example, gather all the documents which constitute a particular chunk on a sharded system. If we choose this alternative, then we could eventually make min()/max() available as part of the match language, rather than being special options to the find command separate from the query predicate.
  2. Think of min()/max() as a special kind of hint, rather than part of the query language proper. This is consistent with how min() and max() are implemented today. This alternative would not break users' ability to express chunk ranges. In order to resolve the issue described by this ticket, we would require any find command with min()/max() to also specify hint(). If there are multiple indices to which the min()/max() could apply, the hint() is needed to disambiguate. With this fix in place, the meaning of the min()/max() would also become evident from the query itself. Right now, find().min({_id: 0}).max({_id: 10}) could either refer to the range [0, 10) of raw _id values, or to a range of hashes. Requiring the user to write down either find().min({_id: 0}).max({_id: 10}).hint({_id: 1}) or find().min({_id: 0}).max({_id: 10}).hint({_id: "hashed"}) clarifies the meaning of the min()/max(). Like option #1, this is a breaking change. However, we believe that many users of min()/max(), including internal uses, already specify hint(). Furthermore, we believe that having both a regular {_id: 1} index and a {_id: "hashed"} index for hashed sharding is a common use case. Therefore, it is important to require the hint() in order to ensure that users of min()/max() with a {_id: "hashed"} shard key will not experience incorrect query results.

We believe that option #2 is both easier to implement, and superior in that it doesn't remove any pre-existing functionality. With Asya's final approval, the future assignee of this ticket should implement fix #2 and request corresponding documentation changes. This should be part of the 4.2 compatibility notes, since users of min()/max() without hint() will have to add the hint() before upgrading to 4.2. I'm putting this ticket back into "Needs Scheduling" so that it can be triaged by the Query Team.

Comment by David Storch [ 22/Feb/19 ]

funny_falcon, thanks for the excellent issue report! Your workaround of using a hint() seems appropriate until this problem can be resolved on our end.

Here is an even simpler demonstration of the issue:

MongoDB Enterprise > db.c.drop()
MongoDB Enterprise > db.c.insert({a: "test"})
 
// When only the index {a: 1} exists, min works as expected.
MongoDB Enterprise > db.c.createIndex({a: 1})
MongoDB Enterprise > db.c.find().min({a: ""})
{ "_id" : ObjectId("5c6efc2adcdcfd401ad03429"), "a" : "test" }
MongoDB Enterprise > db.c.dropIndexes()
 
// When only the index {a: "hashed"} exists, min does not work as expected. The current implementation
// expects the user to provide a range of hash values.
MongoDB Enterprise > db.c.createIndex({a: "hashed"})
MongoDB Enterprise > db.c.find().min({a: ""})
MongoDB Enterprise > // No results.

The result set associated with the min() and max() operators currently depends on the type of index selected to answer the query. The easiest solution to this problem, as funny_falcon hinted at above, would be to make indices which store a function of the data ("hashed", "2d", "2dsphere", etc.) ineligible for servicing a min()/max() query.

The one potential problem I see with this fix is that it would prevent users from issuing queries which specify ranges of hashes. Suppose, for example, that you wish to write application code which partitions your data into n roughly equal-size partitions (where the size is given by the number of documents in the partition). Also suppose that you want to partition your data by the field f. This could be achieved by building the index {f: "hashed"}, and then evenly dividing the space of hashes into n ranges. Where [imin, imax) denotes the range of hash values for the ith partition, the data in this partition would be returned by the query below:

db.collection.find().min({f: imin}).max({f: imax});
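Computing those partition boundaries can be sketched as follows (plain JavaScript with BigInt; `hashBoundaries` is a hypothetical helper — what matters here are the boundary values over the signed 64-bit hash space, not MongoDB's actual md5-based hash function). Note the final boundary, 2^63, is one past the largest representable hash, i.e. the last partition is effectively unbounded above:

```javascript
// Split the signed 64-bit hash space into n half-open ranges.
// Partition i is [bounds[i], bounds[i+1]), usable as
//   db.collection.find().min({f: bounds[i]}).max({f: bounds[i+1]}).hint({f: "hashed"})
function hashBoundaries(n) {
  const MIN = -(2n ** 63n);   // smallest signed 64-bit value
  const SPAN = 2n ** 64n;     // total number of hash values
  const bounds = [];
  for (let i = 0n; i <= BigInt(n); i++) {
    bounds.push(MIN + (SPAN * i) / BigInt(n));
  }
  return bounds;
}

// Example: 4 roughly equal partitions of the hash space.
const bounds = hashBoundaries(4);
```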

Similarly, users would no longer be able to query by a range of geohashes for geospatial indices, but the use case for this seems much less compelling. Finally, min()/max() queries which specify hash or geohash ranges are useful for running queries that return the contents of particular chunks on a sharded system. Without min()/max() the query which would return only the contents of a chunk is not always expressible.

asya, I'd like to discuss with you whether it would be acceptable to pursue the simple solution in which we prevent min()/max() from using a hashed or geospatial index. If we build this solution, would we have to give users a new way to express ranges of index keys directly?

Comment by Kelsey Schubert [ 13/Feb/19 ]

Thanks for the report and clear reproduction, bozaro, and for the initial investigation of the issue, funny_falcon. I can reproduce this behavior and we're looking into it.

Comment by Юрий Соколов [ 13/Feb/19 ]

And a second reason: `CollectionBulkLoaderImpl::init` creates the `_id` index after the secondary indexes.

Comment by Юрий Соколов [ 13/Feb/19 ]

Both the 4.0 and master branches look to be affected.

Comment by Юрий Соколов [ 13/Feb/19 ]

It looks like one of the causes of the bug is that `indexCompatibleMaxMin` doesn't skip `INDEX_HASHED` indexes.

It should probably skip all index types except `INDEX_BTREE`, not only `INDEX_WILDCARD` as on master.
