[SERVER-79629] Consider avoiding copying valueBlock at BlockToRowStage::prepareDeblock Created: 02/Aug/23  Updated: 09/Nov/23  Resolved: 24/Oct/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Task Priority: Major - P3
Reporter: Yoon Soo Kim Assignee: Ian Boros
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
Assigned Teams:
Query Integration
Backwards Compatibility: Fully Compatible
Participants:
Linked BF Score: 35

 Description   

Irina's comment:
I'm concerned about what it means for the queries without the intermediate block processing (or before we can enable it), for example a $group w/o a prior filter: we'll be copying the whole original blocks all the time, no?

Ian's comment:
IMO, we are leaning in the direction of copying slightly too much. Here's what I would do: -We get rid of the _blocks member -On getNext(), we call getValues() on the new block that the child stage placed in the slot. We store this in _deblockedValueRuns. Note that these values are views! But these views are valid until we call getNext() on the child or yield. We don't have to worry about calling getNext() on the child, since by then we'll be done with everything in _deblockedValueRuns, so the other situation we have to think about is yielding... -On doSaveState(), we call copyValue() on everything in _deblockedValueRuns[_curIdx...end]. We should probably also reset _deblockedValueRuns[0..._curIdx] to Nothing so that no one reads it by accident. -On doRestoreState(), we do nothing.

This way, there is no need to clone the block, and the copy is only done for the values we haven't returned yet when there is a yield. Since yielding is rare, we will only copy a small fraction of the data.

Yoonsoo's reply:
"for now", I'd like to go with the current design to simplify the ownership model.

If I follow the Ian's proposal, I need to maintain whether each deblocked value is owned or not. That complicates the ownership model a bit more. At first, it's merely a view and after yielding, it owns values? Either being a view or a value owner will be easier to understand, I think. For now, we agreed on simplifying ownership model for TsBlock, always copying values when deblocking, right? There we simplify the ownership model, here we complicates the ownership model? That does not make much sense to me for now. Instead, let's measure perf and improve the design/code based on perf results. Are you concerned that the improvement will be too hard to achieve? If so, could you explain a little bit how that would be the case?

FYI: valueBlock->clone() will copy underlying buffer only, not deblocked values for bucket unpacking scenario. And I imagined that the lock yielding is much more frequent than we may think because each bucket may have up to 1000 measurements and we also yield lock per each 1000 iteration. So, copying deblocked values may be much more frequent than we currently think. Which one between copying the underlying buffer and copying deblocked values partially is more frequent and more expensive? This was my thought process for the current design.



 Comments   
Comment by Githook User [ 24/Oct/23 ]

Author:

{'name': 'Ian Boros', 'email': 'ian.boros@mongodb.com', 'username': 'borosaurus'}

Message: SERVER-79629 Avoid unconditional copy in block_to_row SBE stage
Branch: master
https://github.com/mongodb/mongo/commit/e716d715322eb40fc13c4fac6bcc3e28f1c7ffe6

Generated at Thu Feb 08 06:41:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.