- Type: Task
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Labels: None
Description
When explain is run on an aggregation pipeline containing $sort or $group with a verbosity of executionStats or above, the output contains extra fields exposing the amount of data processed.
The new fields are, for each stage:
$sort:
- totalDataSizeSortedBytesEstimate (in bytes)
- usedDisk (boolean)
$group:
- totalDataSizeGroupedBytesEstimate (in bytes)
- usedDisk (boolean)
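As a rough illustration of how these fields might be read, here is a minimal Python sketch. It assumes the explain output carries a "stages" array in which each stage's stats sit alongside the stage name; that layout, the helper names (explain_with_execution_stats, data_size_stats), and the sample output are assumptions made for illustration, not taken from the ticket. Only the field names come from the description above.

```python
# Field names as listed in the ticket description.
SIZE_FIELDS = {
    "$sort": "totalDataSizeSortedBytesEstimate",
    "$group": "totalDataSizeGroupedBytesEstimate",
}


def explain_with_execution_stats(db, collection_name, pipeline):
    """Run explain on an aggregation at executionStats verbosity.

    `db` is expected to be a pymongo Database object. This helper is
    hypothetical and is not called in this sketch.
    """
    return db.command(
        "explain",
        {"aggregate": collection_name, "pipeline": pipeline, "cursor": {}},
        verbosity="executionStats",
    )


def data_size_stats(explain_output):
    """Collect the size estimate and usedDisk flag for $sort/$group stages.

    Assumption: each entry of explain_output["stages"] is a dict that holds
    the stage name as a key, with the new stats as sibling keys.
    """
    results = []
    for stage in explain_output.get("stages", []):
        for stage_name, size_field in SIZE_FIELDS.items():
            if stage_name in stage and size_field in stage:
                results.append(
                    {
                        "stage": stage_name,
                        "bytesEstimate": stage[size_field],
                        "usedDisk": stage.get("usedDisk"),
                    }
                )
    return results


# Hypothetical explain output, made up for illustration only.
sample_explain = {
    "stages": [
        {"$cursor": {}},
        {
            "$group": {"_id": "$status", "count": {"$sum": 1}},
            "totalDataSizeGroupedBytesEstimate": 2048,
            "usedDisk": False,
            "nReturned": 3,
        },
        {
            "$sort": {"sortKey": {"count": -1}},
            "totalDataSizeSortedBytesEstimate": 512,
            "usedDisk": False,
            "nReturned": 3,
        },
    ]
}

for entry in data_size_stats(sample_explain):
    print(entry["stage"], entry["bytesEstimate"], entry["usedDisk"])
```

Stages without the new fields (the $cursor entry here) are skipped, so the same helper can be pointed at output from servers that predate this change without raising an error.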
Description of Linked Ticket
SERVER-21784 recently added execution stats to the agg execution layer, and exposed them via the "executionStats" and "allPlansExecution" explain verbosities. That ticket, however, added only nReturned and executionTimeMillis for every stage. There are more stats that we can expose which will be useful for debugging and performance investigations.
One suggestion from alex.bevilacqua is to expose the amount of data processed by $sort or $group. We have such stats for sorts executed in the PlanStage layer, but not for sorts executed in the DocumentSource layer. The $sort stage would report a totalDataSizeSorted metric, and the $group stage would report totalDataSizeGrouped.
Another idea that we could consider implementing at the same time is to report usedDisk:true when either a $sort or a $group spills to disk at runtime.
Scope of changes
Impact to Other Docs
MVP (Work and Date)
Resources (Scope or Design Docs, Invision, etc.)
- documents: SERVER-48380 Expose total data size in bytes processed by $sort and $group in agg execution stats explain (Closed)