[SERVER-56101] Unnecessary FETCH stage when querying an array Created: 14/Apr/21  Updated: 27/Oct/23  Resolved: 20/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kay Agahd Assignee: Edwin Zhou
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

When querying an array as documented here, MonogDB needs to fetch the whole document to read the queried field although both the projected field and the queried field are in the index:

db.test2.insert({a:["1"], p:"2"})
db.test2.createIndex({a:1, p:1})
db.test2.explain(true).find({a:["1"]}, {_id:0,p:1})
"executionStats" : {
		"nReturned" : 1,
		"totalKeysExamined" : 1,
		"totalDocsExamined" : 1,
...
"stage" : "FETCH",
				"filter" : {
					"a" : {
						"$eq" : [
							"1"
						]
					}
				},

The same applies when the $elemMatch operator is used:

db.test2.explain(true).find({a:{$elemMatch:{$eq:"1"}}}, {_id:0,p:1})
"executionStats" : {
		"nReturned" : 1,
		"totalKeysExamined" : 1,
		"totalDocsExamined" : 1,
...
"stage" : "FETCH",
				"filter" : {
					"a" : {
						"$elemMatch" : {
							"$eq" : "1"
						}
					}
				},

However, using the $all operator does not require the FETCH stage:

db.test2.explain(true).find({a:{$all:["1"]}}, {_id:0,p:1})
"executionStats" : {
		"nReturned" : 1,
		"totalKeysExamined" : 1,
		"totalDocsExamined" : 0,

The same is true for the standard way to query a single element in an array:

db.test2.explain(true).find({a:"1"}, {_id:0,p:1})
"executionStats" : {
		"nReturned" : 1,
		"totalKeysExamined" : 1,
		"totalDocsExamined" : 0,

Participants:

 Description   

When querying an array as documented here, MonogDB may need to fetch the whole document to read the queried field although both the projected field and the queried field are in the index.

This is an undesired behaviour because it fetches uselessly documents from the disk which makes the query slow (especially, if documents are bigger).



 Comments   
Comment by Kay Agahd [ 25/Apr/21 ]

Hi edwin.zhou,
thank you for the helpful links!
You are right that the documentation states that

Multikey indexes cannot cover queries over array field(s).

However, as my last two examples show, there are obviously exceptions to the rule, so it would be great if the documentation could be clearer about how multikey indexes may cover a query over array fields. Just saying:

multikey indexes can cover queries over the non-array fields if the index tracks which field or fields cause the index to be multikey.

is not enough because it does not explain how the query and index have to be written in order to be covered by the multikey index over the array field.

Thanks for your consideration.

Comment by Edwin Zhou [ 20/Apr/21 ]

Hi kay.agahd@idealo.de,

I greatly appreciate your detailed reproduction and description of this issue. Multikey indexes cannot cover queries over array fields as documented here.

A more detailed explanation by Asya can be found on SERVER-35223.

Best,
Edwin

Generated at Thu Feb 08 05:38:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.