[SERVER-28720] Database search using find() with 2 indexes of heirarchical sub-documents wrongly return empty cursor Created: 10/Apr/17  Updated: 15/May/17  Resolved: 15/May/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.2.12
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Samaresh Singh Assignee: Asya Kamsky
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-26655 $gt operation on array with index Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

The data:

MongoDB Enterprise > db.coll2.find()
{ "_id" : ObjectId("58e6cf4badef354b4b538a00"), "name" : "jam", "a" : { "b" : [ 1 ], "c" : [ 11 ] } }
{ "_id" : ObjectId("58e6cf77adef354b4b538a01"), "name" : "jam", "a" : { "b" : [ 2, 3 ] } }
{ "_id" : ObjectId("58e6cf83adef354b4b538a02"), "name" : "jam", "a" : { "c" : [ 12 ] } }
{ "_id" : ObjectId("58e6cfb6adef354b4b538a03"), "name" : "jam", "a" : { "c" : [ ] } }
{ "_id" : ObjectId("58e6cfbdadef354b4b538a04"), "name" : "jam", "a" : { "b" : [ ] } }
{ "_id" : ObjectId("58e700b8adef354b4b538a05"), "name" : "jam2" }
{ "_id" : ObjectId("58e701edadef354b4b538a06"), "name" : "jam3", "a" : 34 }
{ "_id" : ObjectId("58e70244adef354b4b538a07"), "name" : "jam3", "a" : { "b" : 34 } }

The indexes:

MongoDB Enterprise > db.coll2.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "test.coll2"
	},
	{
		"v" : 1,
		"key" : {
			"a.b.0" : 1
		},
		"name" : "a.b.0_1",
		"ns" : "test.coll2"
	},
	{
		"v" : 1,
		"key" : {
			"a.c.0" : 1
		},
		"name" : "a.c.0_1",
		"ns" : "test.coll2"
	}
]

WRONG Result of search (looking non empty array entr(ies) in the indexed sub-documents) :

MongoDB Enterprise > db.coll2.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] })
MongoDB Enterprise >

The result returns 0 entries.

If remove the indexes using db.coll2.dropIndexes() and conduct the search again I get the RIGHT result as :

MongoDB Enterprise > db.coll2.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] })
{ "_id" : ObjectId("58e6cf4badef354b4b538a00"), "name" : "jam", "a" : { "b" : [ 1 ], "c" : [ 11 ] } }
{ "_id" : ObjectId("58e6cf77adef354b4b538a01"), "name" : "jam", "a" : { "b" : [ 2, 3 ] } }
{ "_id" : ObjectId("58e6cf83adef354b4b538a02"), "name" : "jam", "a" : { "c" : [ 12 ] } }

Participants:

 Description   

The data:

MongoDB Enterprise > db.coll2.find()
{ "_id" : ObjectId("58e6cf4badef354b4b538a00"), "name" : "jam", "a" : { "b" : [ 1 ], "c" : [ 11 ] } }
{ "_id" : ObjectId("58e6cf77adef354b4b538a01"), "name" : "jam", "a" : { "b" : [ 2, 3 ] } }
{ "_id" : ObjectId("58e6cf83adef354b4b538a02"), "name" : "jam", "a" : { "c" : [ 12 ] } }
{ "_id" : ObjectId("58e6cfb6adef354b4b538a03"), "name" : "jam", "a" : { "c" : [ ] } }
{ "_id" : ObjectId("58e6cfbdadef354b4b538a04"), "name" : "jam", "a" : { "b" : [ ] } }
{ "_id" : ObjectId("58e700b8adef354b4b538a05"), "name" : "jam2" }
{ "_id" : ObjectId("58e701edadef354b4b538a06"), "name" : "jam3", "a" : 34 }
{ "_id" : ObjectId("58e70244adef354b4b538a07"), "name" : "jam3", "a" : { "b" : 34 } }

The indexes:

MongoDB Enterprise > db.coll2.getIndexes()
[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "test.coll2"
	},
	{
		"v" : 1,
		"key" : {
			"a.b.0" : 1
		},
		"name" : "a.b.0_1",
		"ns" : "test.coll2"
	},
	{
		"v" : 1,
		"key" : {
			"a.c.0" : 1
		},
		"name" : "a.c.0_1",
		"ns" : "test.coll2"
	}
]

WRONG Result of search (looking non empty array entr(ies) in the indexed sub-documents) :

MongoDB Enterprise > db.coll2.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] })
MongoDB Enterprise >

The results returns 0 entries.

If remove the indexes using db.coll2.dropIndexes() and conduct the search again I get the RIGHT result as :

MongoDB Enterprise > db.coll2.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] })
{ "_id" : ObjectId("58e6cf4badef354b4b538a00"), "name" : "jam", "a" : { "b" : [ 1 ], "c" : [ 11 ] } }
{ "_id" : ObjectId("58e6cf77adef354b4b538a01"), "name" : "jam", "a" : { "b" : [ 2, 3 ] } }
{ "_id" : ObjectId("58e6cf83adef354b4b538a02"), "name" : "jam", "a" : { "c" : [ 12 ] } }



 Comments   
Comment by Asya Kamsky [ 17/Apr/17 ]

samaresh.singh you are describing a case where presence of a multikey index (an index on an array) changes the results of a query that's comparing to an array. That's precisely what the other two tickets (which are probably fundamentally the same) are also describing.

Comment by Samaresh Singh [ 14/Apr/17 ]

I believe this one is different from SERVER-26655 and SERVER-19761 because in the current case where we have 2 fields "a.b" and "a.c" that have array entries and just have index on any one of the array fields then it does not cause any problem. In other words, having either the index "a.b" OR "a.c" produces the right result. Moreover, in a way this is different, when I created "a.b.0" and "a.c.0" i.e indexes on the first item of both the array fields I got the correct result. It is when I created index in just "a.b" AND "a.c" that the error happened

I tested with 3 sub-document arrays i.e. "a.b", "a.c" and "a.d" and then created indexes using all those sub-document array fields, I got the error but NOT if created index using up to 2 of those array fields i.e. n -1 array fields. I extended the test using 4 and 5 array fields and found that if I have indexes on (n-1) of the array fields I get the correct result. Moreover, if I create indexes using the first element of all the array fields ("a.b.0", "a.c.0", etc) then I get the correct result as well. I get error only if I create indexes using all the array fields.

Comment by Samaresh Singh [ 10/Apr/17 ]

Hi Mark,

Yes, I have also replicated the issue on 3.4.2 earlier as well.

Sincerely
Samaresh

Comment by Samaresh Singh [ 10/Apr/17 ]

Hi Mark,

Unfortunately your index is not the same as mine. I also fixed the issue using the index that you are also showing, where the index is based on the first element of the sub-document i.e. "a.b.0" and "a.c.0" But in the original problem the indexes that I describe are on "a.b" and "a.c" only and not on the first element of the arrays. There is a big difference there. I can use the index here on the first element of the arrays and get away with this use-case but it is a clear bug. BTW mentioned below is the explain(true) output:

MongoDB Enterprise > db.coll2.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] }).explain(true)  which shows "nReurned" value as 0.

{
	"queryPlanner" : {
		"plannerVersion" : 1,
		"namespace" : "test.coll2",
		"indexFilterSet" : false,
		"parsedQuery" : {
			"$or" : [
				{
					"a.b" : {
						"$gt" : [ ]
					}
				},
				{
					"a.c" : {
						"$gt" : [ ]
					}
				}
			]
		},
		"winningPlan" : {
			"stage" : "SUBPLAN",
			"inputStage" : {
				"stage" : "OR",
				"inputStages" : [
					{
						"stage" : "FETCH",
						"filter" : {
							"$or" : [
								{
									"a.b" : {
										"$gt" : [ ]
									}
								}
							]
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"keyPattern" : {
								"a.b" : 1
							},
							"indexName" : "a.b_1",
							"isMultiKey" : true,
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 1,
							"direction" : "forward",
							"indexBounds" : {
								"a.b" : [
									"([], BinData(0, ))"
								]
							}
						}
					},
					{
						"stage" : "FETCH",
						"filter" : {
							"$or" : [
								{
									"a.c" : {
										"$gt" : [ ]
									}
								}
							]
						},
						"inputStage" : {
							"stage" : "IXSCAN",
							"keyPattern" : {
								"a.c" : 1
							},
							"indexName" : "a.c_1",
							"isMultiKey" : false,
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 1,
							"direction" : "forward",
							"indexBounds" : {
								"a.c" : [
									"([], BinData(0, ))"
								]
							}
						}
					}
				]
			}
		},
		"rejectedPlans" : [ ]
	},
	"executionStats" : {
		"executionSuccess" : true,
		"nReturned" : 0,
		"executionTimeMillis" : 0,
		"totalKeysExamined" : 0,
		"totalDocsExamined" : 0,
		"executionStages" : {
			"stage" : "SUBPLAN",
			"nReturned" : 0,
			"executionTimeMillisEstimate" : 0,
			"works" : 2,
			"advanced" : 0,
			"needTime" : 1,
			"needYield" : 0,
			"saveState" : 0,
			"restoreState" : 0,
			"isEOF" : 1,
			"invalidates" : 0,
			"inputStage" : {
				"stage" : "OR",
				"nReturned" : 0,
				"executionTimeMillisEstimate" : 0,
				"works" : 2,
				"advanced" : 0,
				"needTime" : 1,
				"needYield" : 0,
				"saveState" : 0,
				"restoreState" : 0,
				"isEOF" : 1,
				"invalidates" : 0,
				"dupsTested" : 0,
				"dupsDropped" : 0,
				"locsForgotten" : 0,
				"inputStages" : [
					{
						"stage" : "FETCH",
						"filter" : {
							"$or" : [
								{
									"a.b" : {
										"$gt" : [ ]
									}
								}
							]
						},
						"nReturned" : 0,
						"executionTimeMillisEstimate" : 0,
						"works" : 1,
						"advanced" : 0,
						"needTime" : 0,
						"needYield" : 0,
						"saveState" : 0,
						"restoreState" : 0,
						"isEOF" : 1,
						"invalidates" : 0,
						"docsExamined" : 0,
						"alreadyHasObj" : 0,
						"inputStage" : {
							"stage" : "IXSCAN",
							"nReturned" : 0,
							"executionTimeMillisEstimate" : 0,
							"works" : 1,
							"advanced" : 0,
							"needTime" : 0,
							"needYield" : 0,
							"saveState" : 0,
							"restoreState" : 0,
							"isEOF" : 1,
							"invalidates" : 0,
							"keyPattern" : {
								"a.b" : 1
							},
							"indexName" : "a.b_1",
							"isMultiKey" : true,
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 1,
							"direction" : "forward",
							"indexBounds" : {
								"a.b" : [
									"([], BinData(0, ))"
								]
							},
							"keysExamined" : 0,
							"dupsTested" : 0,
							"dupsDropped" : 0,
							"seenInvalidated" : 0
						}
					},
					{
						"stage" : "FETCH",
						"filter" : {
							"$or" : [
								{
									"a.c" : {
										"$gt" : [ ]
									}
								}
							]
						},
						"nReturned" : 0,
						"executionTimeMillisEstimate" : 0,
						"works" : 1,
						"advanced" : 0,
						"needTime" : 0,
						"needYield" : 0,
						"saveState" : 0,
						"restoreState" : 0,
						"isEOF" : 1,
						"invalidates" : 0,
						"docsExamined" : 0,
						"alreadyHasObj" : 0,
						"inputStage" : {
							"stage" : "IXSCAN",
							"nReturned" : 0,
							"executionTimeMillisEstimate" : 0,
							"works" : 1,
							"advanced" : 0,
							"needTime" : 0,
							"needYield" : 0,
							"saveState" : 0,
							"restoreState" : 0,
							"isEOF" : 1,
							"invalidates" : 0,
							"keyPattern" : {
								"a.c" : 1
							},
							"indexName" : "a.c_1",
							"isMultiKey" : false,
							"isUnique" : false,
							"isSparse" : false,
							"isPartial" : false,
							"indexVersion" : 1,
							"direction" : "forward",
							"indexBounds" : {
								"a.c" : [
									"([], BinData(0, ))"
								]
							},
							"keysExamined" : 0,
							"dupsTested" : 0,
							"dupsDropped" : 0,
							"seenInvalidated" : 0
						}
					}
				]
			}
		},
		"allPlansExecution" : [ ]
	},
	"serverInfo" : {
		"host" : "database",
		"port" : 27017,
		"version" : "3.2.12",
		"gitVersion" : "ef3e1bc78e997f0d9f22f45aeb1d8e3b6ac14a14"
	},
	"ok" : 1
}

Sincerely
Samaresh

Comment by Mark Agarunov [ 10/Apr/17 ]

Hello samaresh.singh,

Thank you for the report. Unfortunately I'm unable to reproduce the behavior you've described. Repeating the steps you outlined with the same data appears to give the proper result with the indexes:

> db.test.find()
{ "_id" : ObjectId("58e6cf4badef354b4b538a00"), "name" : "jam", "a" : { "b" : [ 1 ], "c" : [ 11 ] } }
{ "_id" : ObjectId("58e6cf77adef354b4b538a01"), "name" : "jam", "a" : { "b" : [ 2, 3 ] } }
{ "_id" : ObjectId("58e6cf83adef354b4b538a02"), "name" : "jam", "a" : { "c" : [ 12 ] } }
{ "_id" : ObjectId("58e6cfb6adef354b4b538a03"), "name" : "jam", "a" : { "c" : [ ] } }
{ "_id" : ObjectId("58e6cfbdadef354b4b538a04"), "name" : "jam", "a" : { "b" : [ ] } }
{ "_id" : ObjectId("58e700b8adef354b4b538a05"), "name" : "jam2" }
{ "_id" : ObjectId("58e701edadef354b4b538a06"), "name" : "jam3", "a" : 34 }
{ "_id" : ObjectId("58e70244adef354b4b538a07"), "name" : "jam3", "a" : { "b" : 34 } }
> db.test.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "test.test"
	},
	{
		"v" : 2,
		"key" : {
			"a.b.0" : 1
		},
		"name" : "a.b.0_1",
		"ns" : "test.test"
	},
	{
		"v" : 2,
		"key" : {
			"a.c.0" : 1
		},
		"name" : "a.c.0_1",
		"ns" : "test.test"
	}
]
> db.test.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] })
{ "_id" : ObjectId("58e6cf4badef354b4b538a00"), "name" : "jam", "a" : { "b" : [ 1 ], "c" : [ 11 ] } }
{ "_id" : ObjectId("58e6cf77adef354b4b538a01"), "name" : "jam", "a" : { "b" : [ 2, 3 ] } }
{ "_id" : ObjectId("58e6cf83adef354b4b538a02"), "name" : "jam", "a" : { "c" : [ 12 ] } }

Please run the query where you are seeing this issue with .explain(true) appended and provide the output, for example:

db.test.find({$or :[{"a.b" : {$gt : []}}, {"a.c" :{$gt : []}}] }).explain(true)

This should give a better idea of how mongod is running the query.

Thanks,
Mark

Generated at Thu Feb 08 04:18:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.