[SERVER-79629] Consider avoiding copying valueBlock at BlockToRowStage::prepareDeblock Created: 02/Aug/23 Updated: 09/Nov/23 Resolved: 24/Oct/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.2.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Yoon Soo Kim | Assignee: | Ian Boros |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Query Integration
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Participants: | |||||
| Linked BF Score: | 35 | ||||
| Description |
|
Irina's comment:

Ian's comment: This way, there is no need to clone the block, and the copy is only done for the values we haven't returned yet when there is a yield. Since yielding is rare, we will only copy a small fraction of the data.

Yoonsoo's reply: If I follow Ian's proposal, I need to track whether each deblocked value is owned or not. That complicates the ownership model a bit more. At first it's merely a view, and after yielding it owns values? Being either a view or a value owner would be easier to understand, I think. For now, we agreed on simplifying the ownership model for TsBlock by always copying values when deblocking, right? There we simplify the ownership model, and here we complicate it? That does not make much sense to me for now. Instead, let's measure perf and improve the design/code based on the perf results. Are you concerned that the improvement will be too hard to achieve? If so, could you explain a little how that would be the case?

FYI: valueBlock->clone() will copy the underlying buffer only, not the deblocked values, in the bucket unpacking scenario. I also imagined that lock yielding is much more frequent than we may think, because each bucket may have up to 1000 measurements and we also yield the lock every 1000 iterations. So copying deblocked values may be much more frequent than we currently think. Which is more frequent and more expensive: copying the underlying buffer, or partially copying the deblocked values? This was my thought process for the current design. |
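The copy-on-yield idea from Ian's comment, and the extra ownership state Yoonsoo is worried about, can be illustrated with a small sketch. This is only an illustration under stated assumptions: `Block`, `DeblockCursor`, and `prepareForYield` are hypothetical names invented here, strings stand in for deblocked values, and none of this is the actual SBE `ValueBlock` or `BlockToRowStage` code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a value block; NOT the real SBE ValueBlock.
struct Block {
    std::vector<std::string> values;
};

// Hypothetical cursor over deblocked values. It starts out as a pure view
// into the block and becomes an owner of the remaining values on yield.
class DeblockCursor {
public:
    explicit DeblockCursor(const Block& block) {
        _views.reserve(block.values.size());
        for (const std::string& v : block.values) {
            _views.emplace_back(v);  // view only, no copy of the value
        }
    }

    // Copying the cursor could leave views pointing into another object.
    DeblockCursor(const DeblockCursor&) = delete;
    DeblockCursor& operator=(const DeblockCursor&) = delete;

    bool hasNext() const { return _next < _views.size(); }

    std::string_view next() {
        assert(hasNext());
        return _views[_next++];
    }

    // Copies only the values that have not been returned yet, so the block
    // may be released while yielding. Returns the number of values copied.
    std::size_t prepareForYield() {
        if (_owning) {
            return 0;  // remaining values were already copied earlier
        }
        const std::size_t remaining = _views.size() - _next;
        _owned.reserve(remaining);  // no reallocation while re-pointing views
        for (std::size_t i = _next; i < _views.size(); ++i) {
            _owned.emplace_back(_views[i]);
            _views[i] = _owned.back();
        }
        _owning = true;
        return remaining;
    }

private:
    std::vector<std::string_view> _views;
    std::vector<std::string> _owned;
    std::size_t _next = 0;
    bool _owning = false;  // the extra ownership state the reply objects to
};
```

In this sketch the cost of a yield is proportional to the number of unreturned values, and the `_owning` flag is the additional state that makes the cursor "a view first, an owner later". Whether that is cheaper than cloning the underlying buffer depends on how often yields happen, which is the open question in the reply.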
| Comments |
| Comment by Githook User [ 24/Oct/23 ] |
|
Author: Ian Boros (email: ian.boros@mongodb.com, username: borosaurus) Message: |