- Type: Improvement
- Resolution: Works as Designed
- Priority: Minor - P4
- Affects Version/s: 4.4.2
- Component/s: Aggregation Framework
- Sprint: Query Optimization 2021-05-03, Query Optimization 2021-05-17
I have an array field where I need to replace consecutive duplicate values with null (keeping the first occurrence of each run), e.g.
{ x: [22, 22, 80, 80, 80, 80, 80, 443, 443, 443, 5223, 5224] }
should be translated to:
{ x: [22, null, 80, null, null, null, null, 443, null, null, 5223, 5224] }
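In plain JavaScript the intended logic would look roughly like this (the function name `nullOutRepeats` is just illustrative):

```javascript
// Reference sketch of the intended logic in plain JavaScript:
// keep the first element of each run of equal values, null out the repeats.
function nullOutRepeats(xs) {
  return xs.map((v, i) => (i > 0 && v === xs[i - 1] ? null : v));
}

nullOutRepeats([22, 22, 80, 80, 80, 80, 80, 443, 443, 443, 5223, 5224]);
// => [22, null, 80, null, null, null, null, 443, null, null, 5223, 5224]
```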
I developed two versions of an aggregation stage. The first uses `$map` over the array indexes:
{
  $set: {
    x: {
      $cond: {
        if: { $isArray: "$x" },
        then: {
          $map: {
            // walk the index range instead of the values, so each
            // element can be compared with its predecessor
            input: { $range: [0, { $size: "$x" }] },
            as: "idx",
            in: {
              $let: {
                vars: {
                  this: { $arrayElemAt: ["$x", "$$idx"] },
                  prev: { $arrayElemAt: ["$x", { $subtract: ["$$idx", 1] }] }
                },
                in: {
                  $cond: {
                    // null out any element that equals its predecessor
                    if: { $and: [{ $eq: ["$$this", "$$prev"] }, { $gt: ["$$idx", 0] }] },
                    then: null,
                    else: "$$this"
                  }
                }
              }
            }
          }
        },
        else: "$x"
      }
    }
  }
}
and the second uses `$reduce` over the values:
{
  $set: {
    x: {
      $cond: {
        if: { $isArray: "$x" },
        then: {
          $let: {
            vars: {
              values: {
                $reduce: {
                  input: "$x",
                  initialValue: [],
                  in: {
                    // append { val, new } to the accumulator; "new" is null
                    // whenever the value repeats the previous one
                    $concatArrays: [
                      "$$value",
                      [{
                        val: "$$this",
                        new: {
                          $cond: {
                            if: { $ne: ["$$this", { $last: "$$value.val" }] },
                            then: "$$this",
                            else: null
                          }
                        }
                      }]
                    ]
                  }
                }
              }
            },
            in: "$$values.new"
          }
        },
        else: "$x"
      }
    }
  }
}
Both of them work fine; however, the first version with `$map: { input: { $range: ...` is around 10 times faster than the second version with `$reduce`.
Is there a reason why `$reduce` is 10 times slower than `$map`? I would have expected the second version to be a little faster, because the array `"$x"` is read only once. Is there any way to improve it?
My collection contains several million documents; the array sizes vary from 2 to 150k elements, with an average of 50.
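The timings can be reproduced along these lines in mongosh (`mycoll` is a placeholder for the collection name; `mapStage` and `reduceStage` stand for the two `$set` stages shown above):

```javascript
// Rough timing comparison in mongosh. "mycoll" is a placeholder collection
// name; mapStage and reduceStage are the two $set stages shown above.
const mapStage = { $set: { /* $map / $range version from above */ } };
const reduceStage = { $set: { /* $reduce version from above */ } };

for (const [name, stage] of [["$map", mapStage], ["$reduce", reduceStage]]) {
  const t0 = Date.now();
  db.mycoll.aggregate([stage]).itcount(); // itcount() forces full evaluation
  print(`${name} version: ${Date.now() - t0} ms`);
}
```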
Kind Regards
Wernfried