[SERVER-26655] $gt operation on array with index Created: 17/Oct/16  Updated: 06/Dec/22  Resolved: 25/Jan/21

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: alexnikleo Assignee: Backlog - Query Optimization
Resolution: Duplicate Votes: 6
Labels: mql-semantics, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-45233 Indexed inequalities to arrays return... Closed
is duplicated by SERVER-19761 presence of multikey index changes re... Closed
is duplicated by SERVER-28720 Database search using find() with 2 i... Closed
Related
is related to SERVER-11444 $lt/$lte/$gte/$gt behaves differently... Closed
is related to SERVER-19402 Change semantics of sorting by array ... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Participants:

 Description   

Following example works incorrect
http://pastebin.com/VCWanPLZ

The first `find` fetches zero values, the second `find` fetches exactly what it should fetch.
Mongo 3.2.8, Engine: WiredTiger, OS: Ubuntu, OSX



 Comments   
Comment by David Storch [ 25/Jan/21 ]

asya apologies for the delayed response! I did a bit of digging and it looks like this was fixed and backported by ted.tuckman under SERVER-45233. The fix went into all versions of 4.4, as well as 4.2.9 and 4.0.21. I'll close this ticket as a dup of SERVER-45233.

Comment by Asya Kamsky [ 11/Nov/20 ]

I was not able to reproduce this in version 4.4 - which suggests that this was fixed between 4.0 and 4.4.

Comment by Vokail [ 22/Feb/19 ]

Any plan for fixing this ?

 I was able to reproduce the problem on Mongodb 4.0.4, Os Windows 10, storage engine Wiredtiger

Comment by Asya Kamsky [ 02/Dec/18 ]

 

jimrandomh  if you are talking about array values, then you are correct.  In current MQL semantics operations apply both to top level value as well as values inside an array, so when there is a multikey index (index on a field which contains array) then we always end up having to scan the entire index.

 

Comment by James Babcock [ 28/Nov/18 ]

As far as I can tell, reading this thread and the StackOverflow thread, there is currently no way to search for "has field and it is not empty" which is correct, works regardless of the array element type, uses an index, and avoids scanning the entire index. We're giving up and denormalizing to work around this bug.

Comment by Asya Kamsky [ 01/Jun/18 ]

Answering "does this array exist" should not be using comparison to array - that's not a valid test for that.

What would be correct are any of the following:

db.coll.find({a:{$type:"array"}})  // as of 3.6; returns true if a is an array *only*, uses an index
db.coll.find({$or:[{a:[]},{"a.0":{$exists:true}}]}) // works since 2.2, does not use an index

Comment by Niko Fink [ 22/May/18 ]

It would be great to have a documented single best way to answer the "does this array exist" question, regardless of whether an index is used. There is a tutorial on querying for missing values, but neither does it say anything about indices nor is there any mention of doing the exact opposite, which is what we actually want here. And as it seems, there is no actual way of answering this question right now as long as indices are / should be involved, but please correct me if I'm wrong here.

As the question is pretty common and the wrong answers on stackoverflow are very visible - being the first result for a "mongo array not empty" google search while the first (and also accepted and most voted) answer is wrong - could we please have some official guidance on this?

Comment by Asya Kamsky [ 22/May/18 ]

Unfortunately, the linked SO questions includes a lot of good and bad advice but fundamentally when the question is "does this array exist" the answer should never be "use $gt with array" - $gt is an operator that compares scalar values meaningfully, it's not at all clear what its semantics when comparing array values should be.

 

Comment by Niko Fink [ 17/May/18 ]

There is some discussion together with tests and currently possible solutions on Stackoverflow:
https://stackoverflow.com/a/42601244/805569

Comment by Asya Kamsky [ 26/Dec/17 ]

Additional test case that is incorrect when there's an index on "field":

> db.test_mongo_fckp.find()
{ "_id" : ObjectId("5a4293263d1700ff55c283c9"), "field" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283ca"), "field" : [ ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cb"), "field" : [ "" ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cc"), "field" : [ 0 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cd"), "field" : 1 }
{ "_id" : ObjectId("5a4293263d1700ff55c283ce"), "field_1" : 1 }
{ "_id" : ObjectId("5a4294a93d1700ff55c283cf"), "field" : undefined }
{ "_id" : ObjectId("5a4294b13d1700ff55c283d0"), "field" : [ undefined ] }
> db.test_mongo_fckp.createIndex({'field': 1})
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}
> db.test_mongo_fckp.find({'field': {'$eq': []}})
{ "_id" : ObjectId("5a4293263d1700ff55c283ca"), "field" : [ ] }
> db.test_mongo_fckp.find({'field': {'$ne': []}})
{ "_id" : ObjectId("5a4293263d1700ff55c283ce"), "field_1" : 1 }
{ "_id" : ObjectId("5a4293263d1700ff55c283cc"), "field" : [ 0 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283c9"), "field" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cd"), "field" : 1 }
{ "_id" : ObjectId("5a4293263d1700ff55c283cb"), "field" : [ "" ] }
> db.test_mongo_fckp.find({'field': {'$lte': []}})
> db.test_mongo_fckp.find({'field': {'$gte': []}})
> db.test_mongo_fckp.dropIndexes()
{
	"nIndexesWas" : 2,
	"msg" : "non-_id indexes dropped for collection",
	"ok" : 1
}
> db.test_mongo_fckp.find({'field': {'$eq': []}})
{ "_id" : ObjectId("5a4293263d1700ff55c283ca"), "field" : [ ] }
> db.test_mongo_fckp.find({'field': {'$ne': []}})
{ "_id" : ObjectId("5a4293263d1700ff55c283c9"), "field" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cb"), "field" : [ "" ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cc"), "field" : [ 0 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cd"), "field" : 1 }
{ "_id" : ObjectId("5a4293263d1700ff55c283ce"), "field_1" : 1 }
{ "_id" : ObjectId("5a4294a93d1700ff55c283cf"), "field" : undefined }
{ "_id" : ObjectId("5a4294b13d1700ff55c283d0"), "field" : [ undefined ] }
> db.test_mongo_fckp.find({'field': {'$gte': []}})
{ "_id" : ObjectId("5a4293263d1700ff55c283c9"), "field" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283ca"), "field" : [ ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cb"), "field" : [ "" ] }
{ "_id" : ObjectId("5a4293263d1700ff55c283cc"), "field" : [ 0 ] }
{ "_id" : ObjectId("5a4294b13d1700ff55c283d0"), "field" : [ undefined ] }
> db.test_mongo_fckp.find({'field': {'$lte': []}})
{ "_id" : ObjectId("5a4293263d1700ff55c283ca"), "field" : [ ] }

This is with 3.6.0 server.

I think there are two problems here, one is indexing null-ish value, but the other is doing type bracketing when "undefined" is involved in indexed "find".

Comment by Ian Whalen (Inactive) [ 14/Jul/17 ]

The plan is to consider this problem again when we start planning for 3.8.

Comment by David Storch [ 17/Oct/16 ]

Hi alexnikleo,

Thanks for reporting this issue! This is certainly a bug in that the semantics of comparison-to-array are not consistently enforced by the query system. What exactly those semantics should be is a topic that has come up before (see, for example, SERVER-11444). Our development team is going to re-open an internal discussion of comparison-to-array semantics so that we can determine what a fix would look like. Please continue to watch for updates.

Best,
Dave

Comment by Ramon Fernandez Marina [ 17/Oct/16 ]

I'm able to reproduce the behavior you describe: the find() command on line 12 does not return data, while the find() command on line 14 returns the following:

{ "_id" : ObjectId("5804c4dfb1914880dba245f0"), "field" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("5804c4dfb1914880dba245f2"), "field" : [ "" ] }
{ "_id" : ObjectId("5804c4dfb1914880dba245f3"), "field" : [ 0 ] }

Comment by Ramon Fernandez Marina [ 17/Oct/16 ]

Thanks for opening a bug report alexnikleo, I'm adding your sample code below for convenience:

1
db.test_mongo_fckp.remove({})
2
db.test_mongo_fckp.insertMany([
3
  {field: [1,2,3]},
4
  {field: []},
5
  {field: ['']},
6
  {field: [0]},
7
  {field: 1},
8
  {field_1: 1}
9
])
10
db.test_mongo_fckp.find()
11
db.test_mongo_fckp.createIndex({'field': 1})
12
db.test_mongo_fckp.find({'field': {'$gt': []}})
13
db.test_mongo_fckp.dropIndex({'field': 1})
14
db.test_mongo_fckp.find({'field': {'$gt': []}})

Generated at Thu Feb 08 04:12:46 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.