[SERVER-86053] Optimize building BSON from timeseries data Created: 01/Feb/24  Updated: 01/Feb/24

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Alberto Massari Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File PERF-T~2.SVG    
Assigned Teams:
Query Execution
Participants:

 Description   

The match_1000_buckets_include_10_fields benchmark in the timeseries_stress_unpacking project runs a simple pipeline:

[
{$match: {measurement: {$lte: 988000 }}}, 
{$project: {_id: 1, time: 1, measurement: 1, <other 10 fields>}}
]

which is translated into the following plan:

[3] project [s20 = makeBsonObj(MakeObjSpec([_id, time, measurement, p1, p2, p3], Closed, RetNothing), s19, false)] 
[2] mkbson s19 [_id = s13, measurement = s14, p1 = s15, p2 = s16, p3 = s17, time = s18] true false 
[2] block_to_row blocks[s5, s6, s7, s8, s9, s10] row[s13, s14, s15, s16, s17, s18] s12 
[2] filter {!(valueBlockNone(s12, true))} 
[2] project [s12 = cellFoldValues_F(valueBlockFillEmpty(valueBlockLteScalar(cellBlockGetFlatValuesBlock(s11), 9999L), false), s11)] 
[2] ts_bucket_to_cellblock s2 pathReqs[s5 = ProjectPath(Get(_id)/Id), s6 = ProjectPath(Get(measurement)/Id), s7 = ProjectPath(Get(p1)/Id), s8 = ProjectPath(Get(p2)/Id), s9 = ProjectPath(Get(p3)/Id), s10 = ProjectPath(Get(time)/Id), s11 = FilterPath(Get(measurement)/Traverse/Id)] 

The attached flame graph shows that the mkbson stage takes 16%, and the project stage calling makeBsonObj takes another 22%. The two stages appear to perform the same operation twice: could the mkbson stage be removed if the processing of the TsUnblock node supported the MRInfo?
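
The suspected redundancy can be illustrated with a toy model. This is only a sketch in Python, not SBE code: the function names, the dict-based "objects", and the write counter are all hypothetical, and it assumes the projection keeps exactly the fields that the row already carries (as in the plan above). It compares building an intermediate object and then re-projecting it against building the final object straight from the slot values.

```python
from typing import Any, Dict, Sequence

# Field order as produced by the row slots in the plan above (s13..s18).
ROW_FIELDS = ["_id", "measurement", "p1", "p2", "p3", "time"]
# Fields kept by the closed projection spec in the plan above.
KEEP_FIELDS = {"_id", "time", "measurement", "p1", "p2", "p3"}


def build_then_project(row: Sequence[Any], stats: Dict[str, int]) -> Dict[str, Any]:
    """Two passes: materialize an object, then copy the kept fields again."""
    # Pass 1: roughly what an mkbson-like stage does (hypothetical model).
    intermediate: Dict[str, Any] = {}
    for name, value in zip(ROW_FIELDS, row):
        intermediate[name] = value
        stats["writes"] += 1
    # Pass 2: roughly what a makeBsonObj-like projection does (hypothetical model).
    result: Dict[str, Any] = {}
    for name, value in intermediate.items():
        if name in KEEP_FIELDS:
            result[name] = value
            stats["writes"] += 1
    return result


def build_directly(row: Sequence[Any], stats: Dict[str, int]) -> Dict[str, Any]:
    """One pass: apply the projection while building the object from the row."""
    result: Dict[str, Any] = {}
    for name, value in zip(ROW_FIELDS, row):
        if name in KEEP_FIELDS:
            result[name] = value
            stats["writes"] += 1
    return result


sample_row = [1, 42, "a", "b", "c", "2024-02-01T00:00:00Z"]
two_pass_stats = {"writes": 0}
one_pass_stats = {"writes": 0}
two_pass = build_then_project(sample_row, two_pass_stats)
one_pass = build_directly(sample_row, one_pass_stats)
```

In this model both paths yield the same object, but the two-pass path writes every field twice. Whether the real stages can be merged depends on how the TsUnblock processing would consume the MRInfo, which this sketch does not model.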

