Details
-
Task
-
Status: Closed
-
Major - P3
-
Resolution: Done
-
None
-
None
-
3
-
ServerDocs2022: Jun14 - Jun28
-
This is a minor change to the results of a distinct when run on a collection with a view, and the collection contains arrays at the field of interest. We are planning to backport this to 6.0.
Description
This is a minor change to the results of a distinct when run on a collection with a view, and the collection contains arrays at the field of interest. We are planning to backport this to 6.0.
Description of Linked Ticket
SERVER-27644 and SERVER-40134 already exposed differences in behaviour between running a distinct command on a collection vs. running it on a view. The aggregation pipeline created internally to run distinct on a view does not cover all the use-cases.
While investigating the behaviour as part of DRIVERS-1486 I found some more differences.
To test, I've created a collection with data and created a view based on an empty pipeline:
getDocument = function (id, value) {
|
return {
|
_id: id,
|
x: value,
|
nested: {
|
x: value,
|
nested: {
|
x: value
|
},
|
array: [value]
|
},
|
array: [value],
|
documentArray: [
|
{x: value},
|
{x: value << 2}
|
]
|
};
|
}
|
|
db.distinctTest.drop();
|
db.distinctTest.insertMany([
|
getDocument(1, 1),
|
getDocument(2, 2),
|
getDocument(3, 3),
|
getDocument(4, 1)
|
]);
|
db.createView('distinctViewTest', 'distinctTest', []);
|
I've come up with the following calls to distinct:
db.distinctTest.distinct('x');
|
db.distinctTest.distinct('nested.x');
|
db.distinctTest.distinct('nested.nested.x');
|
db.distinctTest.distinct('array');
|
db.distinctTest.distinct('nested.array');
|
db.distinctTest.distinct('documentArray');
|
db.distinctTest.distinct('documentArray.x');
|
db.distinctTest.distinct('documentArray[1].x');
|
db.distinctTest.distinct('documentArray.1.x');
|
Most of the cases look the same (except for different result ordering which we can ignore), but the last case differs:
MongoDB Enterprise > db.distinctTest.distinct('documentArray.1.x');
|
[ 4, 8, 12 ]
|
MongoDB Enterprise > db.distinctViewTest.distinct('documentArray.1.x');
|
[ ]
|
Looking at the pipeline generated, we can see that documentArray.1.x produces three $unwind stages:
[
|
{
|
"$unwind" : {
|
"path" : "$documentArray",
|
"preserveNullAndEmptyArrays" : true
|
}
|
},
|
{
|
"$unwind" : {
|
"path" : "$documentArray.1",
|
"preserveNullAndEmptyArrays" : true
|
}
|
},
|
{
|
"$unwind" : {
|
"path" : "$documentArray.1.x",
|
"preserveNullAndEmptyArrays" : true
|
}
|
},
|
{
|
"$match" : {
|
"documentArray" : {
|
"$_internalSchemaType" : "object"
|
},
|
"documentArray.1" : {
|
"$_internalSchemaType" : "object"
|
}
|
}
|
},
|
{
|
"$group" : {
|
"_id" : null,
|
"distinct" : {
|
"$addToSet" : "$documentArray.1.x"
|
}
|
}
|
}
|
]
|
This is incorrect, as documentArray.1 should not unwind documentArray first, but rather use $arrayElemAt to. This modified aggregation pipeline produces the same result as the corresponding distinct command:
[
|
{
|
"$set": {
|
"documentArray": { $arrayElemAt: [ "$documentArray", 1 ] }
|
}
|
},
|
{
|
"$unwind" : {
|
"path" : "$documentArray",
|
"preserveNullAndEmptyArrays" : true
|
}
|
},
|
{
|
"$unwind" : {
|
"path" : "$documentArray.x",
|
"preserveNullAndEmptyArrays" : true
|
}
|
},
|
{
|
"$match" : {
|
"documentArray" : {
|
"$_internalSchemaType" : "object"
|
}
|
}
|
},
|
{
|
"$group" : {
|
"_id" : null,
|
"distinct" : {
|
"$addToSet" : "$documentArray.x"
|
}
|
}
|
}
|
]
|
I was able to reproduce this in 4.2.12, 4.4.3, and 4.9.0-alpha4. It is likely that this also affects previous versions which I didn't have on hand to test. SERVER-27644 introduced the $unwind logic and was backported to 3.4, so I expect all versions starting with that being affected.
Attachments
Issue Links
- backported by
-
DOCS-15317 [BACKPORT] [v6.0] Behaviour of distinct differs between collections and views
-
- Closed
-
- documents
-
SERVER-55112 Behaviour of distinct differs between collections and views
-
- Closed
-