[SERVER-32839] Aggregation sorts on multiple array fields produce error in 3.6 Created: 22/Jan/18  Updated: 27/Oct/23  Resolved: 24/Jan/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.6.0, 3.7.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Steven Benjamin Assignee: Kelsey Schubert
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File bug.md    
Issue Links:
Related
related to DOCS-11241 Update Compatibility Changes for 3.6:... Closed
related to SERVER-32859 Improve error message when sorting pa... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Sample dataset:

{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd41417") }, "creationDate" : ISODate("2018-01-22T18:37:13.643Z"), "location" : { "type" : "Point", "coordinates" : [ -73.41825120678124, 41.11954344761012 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 2, "b" : 0, "s" : "XYZ_2", "d" : ISODate("2065-11-12T14:57:25.761Z"), "e" : 2, "mixed" : 2, "l" : [ 2, 3 ], "m" : { "i" : 2 } }, "lastEditDate" : ISODate("2018-01-22T18:37:14.659Z") }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd41418") }, "creationDate" : ISODate("2018-01-22T18:37:13.644Z"), "location" : { "type" : "Point", "coordinates" : [ -73.37204671642996, 41.685165896010886 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 3, "b" : 1, "s" : "XYZ_3", "d" : ISODate("2036-05-22T06:29:17.274Z"), "f" : 3, "mixed" : "a", "l" : [ 3, 4 ], "m" : { "i" : 3 } }, "lastEditDate" : ISODate("2018-01-22T18:37:14.660Z") }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd41419") }, "creationDate" : ISODate("2018-01-22T18:37:13.645Z"), "location" : { "type" : "Point", "coordinates" : [ -73.53230022652237, 41.99925690167811 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 4, "b" : 0, "s" : "XYZ_4", "d" : ISODate("2040-09-27T12:39:26.191Z"), "e" : 4, "mixed" : 4, "l" : [ 4, 5 ], "m" : { "i" : 4 } }, "lastEditDate" : ISODate("2018-01-22T18:37:14.661Z") }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd4141a") }, "creationDate" : ISODate("2018-01-22T18:37:13.647Z"), "location" : { "type" : "Point", "coordinates" : [ -73.37406135644643, 41.20800859893658 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 5, "b" : 1, "s" : "XYZ_5", "d" : ISODate("2032-01-12T02:54:20.554Z"), "mixed" : 5, "l" : [ 5, 6 ], "m" : { "i" : 5 } } }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd4141b") }, "creationDate" : ISODate("2018-01-22T18:37:13.648Z"), "location" : { "type" : "Point", "coordinates" : [ -73.23532883469254, 41.82323919775319 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 6, "b" : 0, "s" : "XYZ_6", "d" : ISODate("2040-06-19T23:57:21.058Z"), "e" : 6, "f" : 6, "mixed" : "a", "l" : [ 6, 7 ], "m" : { "i" : 6 } } }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd4141c") }, "creationDate" : ISODate("2018-01-22T18:37:13.649Z"), "location" : { "type" : "Point", "coordinates" : [ -73.836244603408, 41.10642459488787 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 7, "b" : 1, "s" : "XYZ_7", "d" : ISODate("2023-06-04T05:10:46.560Z"), "mixed" : 7, "l" : [ 7, 8 ], "m" : { "i" : 7 } } }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd4141d") }, "creationDate" : ISODate("2018-01-22T18:37:13.650Z"), "location" : { "type" : "Point", "coordinates" : [ -73.25802811955722, 41.6346098603488 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 8, "b" : 0, "s" : "XYZ_8", "d" : ISODate("2045-12-15T12:08:01.250Z"), "e" : 8, "mixed" : 8, "l" : [ 8, 9 ], "m" : { "i" : 8 } } }
 
{ "_id" : { "b" : ObjectId("5a662f597f3b985a7cd41414"), "d" : ObjectId("5a662f597f3b985a7cd4141e") }, "creationDate" : ISODate("2018-01-22T18:37:13.651Z"), "location" : { "type" : "Point", "coordinates" : [ -73.17641662154917, 41.15470847481475 ] }, "creatorId" : "i000000000000000000000001", "aZUQ" : { "a" : 9, "b" : 1, "s" : "XYZ_9", "d" : ISODate("2060-01-04T02:18:06.486Z"), "f" : 9, "mixed" : "a", "l" : [ 9, 10 ], "m" : { "i" : 9 } } }

Indexes:

> db.drops.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "drops.drops"
	},
	{
		"v" : 2,
		"key" : {
			"location" : "2dsphere"
		},
		"name" : "location_2dsphere",
		"ns" : "drops.drops",
		"background" : true,
		"2dsphereIndexVersion" : 3
	},
	{
		"v" : 2,
		"key" : {
			"_id.b" : 1,
			"_id.d" : 1
		},
		"name" : "_id.b_1__id.d_1",
		"ns" : "drops.drops",
		"background" : true
	},
	{
		"v" : 2,
		"key" : {
			"_id.d" : 1
		},
		"name" : "_id.d_1",
		"ns" : "drops.drops",
		"background" : true
	},
	{
		"v" : 2,
		"key" : {
			"_id.b" : 1,
			"creationDate" : -1
		},
		"name" : "_id.b_1_creationDate_-1",
		"ns" : "drops.drops",
		"background" : true
	},
	{
		"v" : 2,
		"key" : {
			"_id.b" : 1,
			"fhash" : 1
		},
		"name" : "_id.b_1_fhash_1",
		"ns" : "drops.drops",
		"background" : true,
		"sparse" : true
	}
]

Aggregation

i=[{ "$match" : { "aZUQ.a" : { "$gt" : 1} , "_id.b" : ObjectId("5a662f597f3b985a7cd41414")}},
{ "$project" : { "a" : "$aZUQ.a" , "month_creationDate" : { "$month" : "$creationDate"}}},
{ "$group" : { "_id" : { "a" : "$a"} , "a" : { "$addToSet" : "$a"} , "month" : { "$addToSet" : "$month_creationDate"}}},
{ "$sort" : { "month" : 1 , "a" : 1}},
{ "$limit" : 100}]

db.drops.aggregate([i[0], i[1], i[2]])

{ "_id" : { "a" : 9 }, "a" : [ 9 ], "month" : [ 1 ] },
{ "_id" : { "a" : 8 }, "a" : [ 8 ], "month" : [ 1 ] },
{ "_id" : { "a" : 7 }, "a" : [ 7 ], "month" : [ 1 ] },
{ "_id" : { "a" : 6 }, "a" : [ 6 ], "month" : [ 1 ] },
{ "_id" : { "a" : 4 }, "a" : [ 4 ], "month" : [ 1 ] },
{ "_id" : { "a" : 3 }, "a" : [ 3 ], "month" : [ 1 ] },
{ "_id" : { "a" : 5 }, "a" : [ 5 ], "month" : [ 1 ] },
{ "_id" : { "a" : 2 }, "a" : [ 2 ], "month" : [ 1 ] }

Adding the sort line will fail:

db.drops.aggregate([i[0], i[1], i[2],i[3]])`
>
2018-01-22T13:38:07.220-0500 E QUERY    [thread1] Error: command failed: {
	"ok" : 0,
	"errmsg" : "cannot sort with keys that are parallel arrays",
	"code" : 2,
	"codeName" : "BadValue"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:403:5
DB.prototype._runAggregate@src/mongo/shell/db.js:260:9
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1212:12
@(shell):1:1

Participants:

 Description   
Original Summary

Aggregation sorting error in 3.5, 3.6. No error in 3.4

Original Description

We have some generated aggregation queries. I've verified that our tests fail with 3.5, 3.6 and 3.7 server (testing on OSX, single instance environment) but succeed with 3.4 (and I'm assuming earlier versions since we were on 3.2 and this has never caused us a problem).The queries are being generated from the java driver but fail identically in the shell.

This happens in an aggregation sort step. Given an aggregation whose pipeline steps return the following:

 { "_id" : { "a" : 2 }, "a" : [ 2 ], "month" : [ 1 ] }
 { "_id" : { "a" : 3 }, "a" : [ 3 ], "month" : [ 1 ] }
 { "_id" : { "a" : 4 }, "a" : [ 4 ], "month" : [ 1 ] }
 { "_id" : { "a" : 5 }, "a" : [ 5 ], "month" : [ 1 ] }
 { "_id" : { "a" : 6 }, "a" : [ 6 ], "month" : [ 1 ] }
 { "_id" : { "a" : 7 }, "a" : [ 7 ], "month" : [ 1 ] }
 { "_id" : { "a" : 8 }, "a" : [ 8 ], "month" : [ 1 ] }
 { "_id" : { "a" : 9 }, "a" : [ 9 ], "month" : [ 1 ] }

adding a step of the form {"$sort":{"month":1, "a":1}}

Throws the error:

2018-01-22T13:38:07.220-0500 E QUERY    [thread1] Error: command failed: {
	"ok" : 0,
	"errmsg" : "cannot sort with keys that are parallel arrays",
	"code" : 2,
	"codeName" : "BadValue"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:403:5
DB.prototype._runAggregate@src/mongo/shell/db.js:260:9
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1212:12
@(shell):1:1



 Comments   
Comment by Kelsey Schubert [ 24/Jan/18 ]

Hi sbenjamin,

Thanks very much for taking the time to report this issue and provide additional context. It's very helpful to better understand how this query was generated, and I can see how this issue would arise in a query-generation context. I'm sorry this change of behavior was not better communicated. To resolve this issue, I would suggest modifying this code to ensure that sorts do not use keys that contain multiple arrays.

Please feel free to watch SERVER-32859 and DOCS-11241 for updates and let us know if there is anything else we can do to help clarify this change of behavior.

Thanks again,
Kelsey

Comment by Steven Benjamin [ 24/Jan/18 ]

Hello Kelsey,
The query comes from some query-generation code I wrote that wraps the aggregation framework. Because of the way the wrapper was written, there's an output stage that pushes values, (the "$addToSet" in the grouping above) that then gets pushed to a (possible) sort step (or other steps). This makes it possible in some cases to generate the query in question, and the error shows up in some unit tests on the upgraded mongo version. The "$addToSet" step functions in that context as a sort of generic gather step - you might want the set, you might want to do further processing (e.g. sorting), but grabbing the step in a group is the most generic thing you'd want to do. It's not necessarily a question of running that particular query - obviously if you were writing an ad-hoc query there'd be easier (and more concise) ways of doing this.

I'm totally willing to accept that:

  • this results from some misunderstanding on my part about how the aggregation framework works
  • I'm trying to do something unreasonable.
  • there are probably other ways to do this
  • the behavior previously would have resulted in inconsistent behavior that I didn't notice

However, I filed the report because this represents a change in the server behavior. Docs and better error messages would definitely be helpful. Thanks.

Comment by Kelsey Schubert [ 23/Jan/18 ]

Hi sbenjamin,

Thanks for reporting this behavior and the clear reproduction steps. In SERVER-19402, we modified how sort orders are generated for array fields. A strong motivation for this change was to ensure consistent sort orders for all aggregation and find commands. As a result, we can now guarantee that index key provided sort order and in-memory provided sort order will match. Since a meaningful sort order for index keys containing multiple arrays cannot be generated, attempting to sort on multiple array fields, will produce the error you have observed.

I understand that this error message is not the clearest, and have opened SERVER-32859 to better communicate what is going on, as well as a DOCS-11241 to better describe this change of behavior between 3.6 and earlier versions.

However, I would like to better understand why you were using this sort in previous versions. My understanding is that this sort order would produce a somewhat arbitrary sort order. If that's the case, would sorting against another field accomplish the same requirement and function as a suitable workaround?

Thank you,
Kelsey

Comment by Steven Benjamin [ 22/Jan/18 ]

Apologies for the formatting. I have reposted in a clearer form in the attachement.

Generated at Thu Feb 08 04:31:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.