[SERVER-66606] Full statistics pipeline Created: 19/May/22  Updated: 29/Oct/23  Resolved: 21/Nov/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Task Priority: Major - P3
Reporter: Joel Redman (Inactive) Assignee: Alya Berciu
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-69782 In CQF we erroneously match documents Closed
depends on SERVER-67506 [CQF] Dotted path equality to null in... Open
Duplicate
is duplicated by SERVER-71303 Estimate nested arrays in max_diff code Closed
Backwards Compatibility: Fully Compatible
Sprint: QO 2022-06-27, QO 2022-07-11, QO 2022-07-25, QO 2022-08-08, QO 2022-08-22, QO 2022-09-05, QO 2022-09-19, QO 2022-10-03, QE 2022-10-17, QE 2022-10-31, QE 2022-11-14
Participants:

 Description   

Generate an internal command for statistics analysis which runs the following pipeline (assuming path "a.b.c"):

 

db.coll.aggregate( [
{ $project: {val : path,
     hasMissing : { $cond: [{ $isArray : path },
                  {$anyElementTrue : { $map: {input:"$a.b", in: {$eq: [{$type: "$$this.c"}, "missing"]}}}},
                  {$eq: [{$type: path}, "missing"]} ]}
}},
{ $addFields: {isArray : {$isArray : "$val"}}},
{ $unwind: {path: "$val", preserveNullAndEmptyArrays : true}},
{ $sort : {val : 1}},
{ $_analyzeInternal: {} }
]);

 

Note that the special handling of the prefix path ("a.b" above) is only necessary when such a prefix exists, so some logic will be needed to decide when to emit it. The idea is to project the path while noting whether any subdocuments within arrays were missing the field. We then add a field that tells us whether the value is an array, then unwind and sort, preserving null, missing, and empty arrays.

$_analyzeInternal will be implemented in another ticket. If the integration timing works out poorly, this stage can either be removed or replaced with something that generates an empty document.
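
The prefix logic described above could be sketched roughly as follows. This is only an illustration in plain JavaScript, not the command's actual implementation: the helper name buildStatsPipeline and the includeAnalyze flag are hypothetical, the stage shapes simply mirror the pipeline in this description, and the result has not been validated against a server. The array-aware $cond branch is emitted only when the path has a prefix; a top-level path gets the plain "missing" check.

```javascript
// Hypothetical helper (not part of the server): builds the pipeline from the
// description for an arbitrary dotted path such as "a.b.c".
function buildStatsPipeline(path, includeAnalyze = true) {
    const fieldRef = "$" + path;
    const lastDot = path.lastIndexOf(".");

    // Expression that is true when `expr` evaluates to a missing value.
    const isMissing = (expr) => ({$eq: [{$type: expr}, "missing"]});

    let hasMissing;
    if (lastDot === -1) {
        // No prefix: the array-of-subdocuments case does not apply.
        hasMissing = isMissing(fieldRef);
    } else {
        const prefixRef = "$" + path.slice(0, lastDot);  // e.g. "$a.b"
        const leaf = path.slice(lastDot + 1);            // e.g. "c"
        hasMissing = {
            $cond: [
                {$isArray: fieldRef},
                // True if any subdocument in the prefix array lacks the leaf.
                {$anyElementTrue: {$map: {input: prefixRef, in: isMissing("$$this." + leaf)}}},
                isMissing(fieldRef),
            ],
        };
    }

    const pipeline = [
        {$project: {val: fieldRef, hasMissing: hasMissing}},
        {$addFields: {isArray: {$isArray: "$val"}}},
        {$unwind: {path: "$val", preserveNullAndEmptyArrays: true}},
        {$sort: {val: 1}},
    ];
    // The final stage is optional because it is implemented in another ticket.
    if (includeAnalyze) {
        pipeline.push({$_analyzeInternal: {}});
    }
    return pipeline;
}
```

Under these assumptions, buildStatsPipeline("a.b.c") yields the five stages shown in the description, and buildStatsPipeline("a", false) yields four stages with the simpler hasMissing check.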



 Comments   
Comment by Githook User [ 21/Nov/22 ]

Author:

{'name': 'Alya Berciu', 'email': 'alya.berciu@mongodb.com', 'username': 'alyacb'}

Message: SERVER-66606 Count types in stats for CE
Branch: master
https://github.com/mongodb/mongo/commit/5942345702bad4542a523a36885c3fcafa02bded

Comment by Sam Mercier [ 07/Jul/22 ]

Note: we're leaning away from using the DS's and instead we're going to try to write custom SBE.

Generated at Thu Feb 08 06:05:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.