[SERVER-78887] Support yield/unyield for TsBucketToCellBlockStage / ValueBlock
| Created: 12/Jul/23 | Updated: 29/Oct/23 | Resolved: 03/Aug/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Yoon Soo Kim | Assignee: | Yoon Soo Kim |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Sprint: | QI 2023-07-24, QI 2023-08-07 |
| Participants: |
| Comments |
| Comment by Githook User [ 03/Aug/23 ] | |||
|
Author: {'name': 'Yoonsoo Kim', 'email': 'yoonsoo.kim@mongodb.com', 'username': 'yun-soo'}Message: | |||
| Comment by Githook User [ 03/Aug/23 ] | |||
|
Author: {'name': 'Yoonsoo Kim', 'email': 'yoonsoo.kim@mongodb.com', 'username': 'yun-soo'}Message: | |||
| Comment by Ian Boros [ 12/Jul/23 ] | |||
|
Copy-pasting my PR comment: I forgot to mention that this stage will interact with yielding and getNext() calls in a fun way. Imagine I have the following plan:
I can run this plan, pull a bucket out of storage, return some data through block_to_row, and then fill up a getMore batch midway through processing one bucket. This means the underlying BSON will disappear, and the cursor we opened on the ValueBlock may be invalidated if it points to storage-owned memory. The way I was planning to solve this uses the extract() API. Every time block_to_row gets a new bucket, we extract() it, which returns a vector of unowned tag/value pairs. We remember our index i into that vector and return the i-th element on each call to getNext(). When a yield occurs, we call value::copy() on elements i through the last one and proceed as normal. There are other options, like complicating Cursor, but IMO this one is pretty straightforward, even if it means adding extract() back for this patch. Other thoughts? |
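The extract-then-copy-on-yield idea in the comment above can be sketched roughly as follows. This is a minimal standalone illustration, not the server code: `Value`, `copyValue`, `BlockToRowSketch`, `openBucket`, and `prepareForYield` are all hypothetical names, with `Value` standing in for an SBE tag/value pair and `copyValue` standing in for value::copy().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tag/value pair. `view` may point at
// storage-owned memory; `owned` is non-null once the value has been copied.
struct Value {
    std::string_view view;
    std::shared_ptr<std::string> owned;
};

// Hypothetical stand-in for value::copy(): makes an owned deep copy.
Value copyValue(const Value& v) {
    auto owned = std::make_shared<std::string>(v.view);
    return Value{std::string_view(*owned), owned};
}

// Sketch of the block_to_row bookkeeping described in the comment.
class BlockToRowSketch {
public:
    // Called for each new bucket; `extracted` plays the role of the
    // vector of unowned values returned by extract().
    void openBucket(std::vector<Value> extracted) {
        _values = std::move(extracted);
        _index = 0;
    }

    // Returns the i-th value and advances, or nullptr when the bucket is done.
    const Value* getNext() {
        if (_index >= _values.size()) {
            return nullptr;
        }
        return &_values[_index++];
    }

    // Called before a yield: copy values i through the last one so they
    // no longer depend on storage-owned memory.
    void prepareForYield() {
        assert(_index <= _values.size());
        for (std::size_t i = _index; i < _values.size(); ++i) {
            if (!_values[i].owned) {
                _values[i] = copyValue(_values[i]);
            }
        }
    }

private:
    std::vector<Value> _values;
    std::size_t _index = 0;
};
```

In this sketch only the values that have not yet been returned are copied, so a bucket that is fully consumed before any yield pays no copy cost. Values already handed out before the yield stay unowned, matching the "i through the last one" wording above.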