[SERVER-37246] $arrayElemAt aggregation, referencing property within an array element object, indexes don't include elements without that property. Created: 21/Sep/18  Updated: 27/Oct/23  Resolved: 01/Oct/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.6.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sam Miller Assignee: Nick Brewer
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to DOCS-12068 [Server] Add more detailed section/pa... External Review
Operating System: ALL
Steps To Reproduce:

db.test.insert(
   {
      "_id" : ObjectId("5ba4d6ca79f29bad7b3fa913"),
      "firstArray": [
         {
         },
         {
            "secondArray": ["a6", "a7", "a8", "a9", "a10"]
         },
         {
            "secondArray": ["a11", "a12", "a13", "a14", "a15"]
         },
      ]
   }
)

 

Then run the following aggregation:

db.test.aggregate([
   {
      "$match": {
         "_id": ObjectId("5ba4d6ca79f29bad7b3fa913")
{{      }}}
   },
   {
      "$project": {
         "array": {
            "$arrayElemAt": ["$firstArray.secondArray", 0]
         }
{{      }}}
   }
])

Returns:

{{{}}
   "array": ["a6", "a7", "a8", "a9", "a10"]
}

This is confusing. I would expect it to return:

{{{}}
   "array": null
}

or whatever mongo uses to indicate the aggregate returns nothing.

 

The reason I expect it to return nothing, is because the first element of firstArray }}doesn't contain a property {{secondArray, so the index of 0 should first be selecting the first element of firstArray (i.e., ,) and then return secondArray which doesn't exit.

 

What is actually happening is mongo is getting all the array elements that contain secondArray and then applying the index to that array. Hence why }}{{"$arrayElemAt": ["$firstArray.secondArray", 2] returns nothing, and not the last element in firstArray.

 

Am I missing something? The documentation doesn't make it clear that this is the expected behaviour. Either the documentation should be updated to make it clear what the index supplied to $arrayElemAt actually applies to or this bug should be fixed.

Many thanks!

 

 

 

 

 

Participants:

 Description   

When using the $arrayElemAt to select an array element of an array and to then include an property within that array element object, I've noticed that the index only applies to elements of the array that include the property. It's easier if I explain this with an example so please see the Steps to reproduce.



 Comments   
Comment by Sam Miller [ 08/Oct/18 ]

Hi,

Yeh, that makes sense. Updating the documentation will help! Thanks!

Comment by Nick Brewer [ 01/Oct/18 ]

sjom106 I'm closing this ticket in favor of DOCS-12068. Let us know if you have any questions.

-Nick

Comment by Nick Brewer [ 21/Sep/18 ]

sjom106 I'm curious what reason there would be to specify a nested array such as ["$firstArray.secondArray", 0] if it is expected that this will always be treated identically to ["$firstArray", 0]. If the nested field was not considered against the index, you would need to know exactly where it exists within an array in order to project it out (assuming you're not combining $arrayElemAt with $filter).

Take for example the following data:

{
	"_id" : ObjectId("5ba51cb9123af44066ec85c2"),
	"example" : [
		{
 
		},
		{
 
		},
		{
			"apples" : [
				"macintosh",
				"golden delicious",
				"fuji",
				"gala"
			]
		},
		{
 
		},
		{
 
		},
		{
			"oranges" : [
				"mandarin",
				"navel",
				"clementine"
			]
		},
		{
			"apples" : [
				"braeburn",
				"honeycrisp",
				"red"
			]
		}
	]
}

If you only wanted to project out the second occurrence of "apples" within the "example" array - currently you could accomplish this via:

{
   "$project": { 
      "apples2": { 
         "$arrayElemAt": ["$example.apples", 1] 
       }
   }
}

Whereas if you know the exact placement of the second "apples" array within the "example" array, you're still able to use its exact placement to project it:

db.test.aggregate([
   {
      "$project": {
         "apples2": {
            "$arrayElemAt": ["$example", 6]
         }
            }
   }
])

If the nested field was not considered against the index value, only the latter method would work.

I definitely agree that the documentation for $arrayElemAt could use clarification on how nested fields are handled. I've opened a separate documentation ticket (DOCS-12068) to track that work.

-Nick

Comment by Sam Miller [ 21/Sep/18 ]

Hi, thanks for taking a look. I realise that it's because it's matching the first nested array, but my point is that it's confusing.

The following returns the first element of firstArray:

{
   "$project": { 
      "array": { 
         "$arrayElemAt": ["$firstArray", 0] 
       }
   }
}

but as soon as a nested property is referenced, it returns the second element of firstArray (just because the first element doesn't have a property called secondArray):

{
   "$project": { 
      "array": { 
         "$arrayElemAt": ["$firstArray.secondArray", 0] 
      }
   }
}

The index hasn't changed, it is now applying to a different "array", an array made up of only elements that have a propery called secondArray. Is that how this feature is supposed to work? It seems inconsistent to me, and confusing. 

It would be great if the index supplied always refers to the original array, and not the array after any elements have been removed because of nested properties that it doesn't have.

If that can't be done, could the documentation be updated to make it clear that is what happens? Even then I would argue that is a confusing way for this feature to work, but I am probably missing something! Thanks.

 

 

 

Comment by Nick Brewer [ 21/Sep/18 ]

sjom106 You're receiving this result because you're specifying firstArray.secondArray with the integer 0, so the first nested array within firstArray that matches secondArray is used. If you did not specify the nested array, and simply used:

db.test.aggregate([
   {
      "$match": {
         "_id": ObjectId("5ba4d6ca79f29bad7b3fa913")
             }
   },
   {
      "$project": {
         "array": {
            "$arrayElemAt": ["$firstArray", 0]
         }
            }
   }
])

You would receive the empty result from the first array:

{ "_id" : ObjectId("5ba4d6ca79f29bad7b3fa913"), "array" : {  } }

-Nick

Comment by Sam Miller [ 21/Sep/18 ]

Sorry, I seemed to have messed up some of the formatting. The second aggregate to run should be:

db.test.aggregate([
   {
      "$match": {
         "_id": ObjectId("5ba4d6ca79f29bad7b3fa913")
             }
   },
   {
      "$project": {
         "array": {
            "$arrayElemAt": ["$firstArray.secondArray", 0]
         }
            }
   }
])

Returns:

{
   "array": ["a6", "a7", "a8", "a9", "a10"]
}

This is confusing. I would expect it to return:

{
   "array": null
}

 

 

 

Generated at Thu Feb 08 04:45:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.