[SERVER-78887] Support yield/unyield for TsBucketToCellBlockStage / ValueBlock
Created: 12/Jul/23  Updated: 29/Oct/23  Resolved: 03/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Task
Priority: Major - P3
Reporter: Yoon Soo Kim
Assignee: Yoon Soo Kim
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: QI 2023-07-24, QI 2023-08-07
Participants:

Comments
Comment by Githook User [ 03/Aug/23 ]

Author: Yoonsoo Kim <yoonsoo.kim@mongodb.com> (username: yun-soo)

Message: SERVER-78887 Support yielding for TsBucketToCellBlock and BlockToRow stages
Branch: minh.luu-no_compile_sys-perf
https://github.com/mongodb/mongo/commit/00d62e3c49ea299e68b9178fc5c8b3b6e4a38b52

Comment by Githook User [ 03/Aug/23 ]

Author: Yoonsoo Kim <yoonsoo.kim@mongodb.com> (username: yun-soo)

Message: SERVER-78887 Support yielding for TsBucketToCellBlock and BlockToRow stages
Branch: master
https://github.com/mongodb/mongo/commit/00d62e3c49ea299e68b9178fc5c8b3b6e4a38b52

Comment by Ian Boros [ 12/Jul/23 ]

Copy-pasting my PR comment:

I forgot to mention, this stage will interact with yielding/getNexts in a fun way. Imagine I have the following plan:

block_to_row [s1 -> s2]
ts_bucket_to_cellblock a=s1
scan ...

I can run this plan, pull a bucket out of storage, return some data through block_to_row, and then fill up a getMore batch midway through processing one bucket. At that point the plan has to save its state and give up its storage resources, so the underlying BSON can disappear, and the cursor we opened on the ValueBlock may become invalidated if it's pointing to storage-owned memory.

The way I was planning to solve this actually used the extract() API. Every time block_to_row gets a new bucket, we extract() it, which returns a vector of unowned tag/values. We remember our index into that vector, and return the i'th one each call to getNext(). When a yield occurs, we call value::copy() on values i through the last one and proceed as normal.
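
A rough sketch of that idea, for illustration only. This is not the actual SBE code: `Value`, `BlockToRowSketch`, `open()`, and `saveState()` here are hypothetical stand-ins, with `std::string_view` playing the role of an unowned tag/value and `makeOwned()` playing the role of value::copy().

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tag/value pair: either a view into
// storage-owned memory, or a copy that this stage owns.
struct Value {
    std::string_view unowned;  // points into the bucket while not owned
    std::string owned;         // backing copy once the value is owned
    bool isOwned = false;

    std::string_view get() const {
        return isOwned ? std::string_view(owned) : unowned;
    }

    // Plays the role of value::copy(): detach from storage-owned memory.
    void makeOwned() {
        if (!isOwned) {
            owned = std::string(unowned);
            unowned = {};
            isOwned = true;
        }
    }
};

// Hypothetical model of block_to_row using the extract-then-index approach.
class BlockToRowSketch {
public:
    // Called for each new bucket with the result of an extract()-like call:
    // a vector of unowned values.
    void open(const std::vector<std::string_view>& extracted) {
        _values.clear();
        for (std::string_view v : extracted) {
            Value val;
            val.unowned = v;
            _values.push_back(std::move(val));
        }
        _index = 0;
    }

    // Return the i'th value on each call, or nullopt when the bucket is done.
    std::optional<std::string_view> getNext() {
        if (_index >= _values.size()) {
            return std::nullopt;
        }
        return _values[_index++].get();
    }

    // Called when a yield occurs: copy values i through the last one so they
    // no longer depend on the bucket's memory.
    void saveState() {
        for (std::size_t i = _index; i < _values.size(); ++i) {
            _values[i].makeOwned();
        }
    }

private:
    std::vector<Value> _values;
    std::size_t _index = 0;
};
```

Values already returned are not copied in this sketch; only the ones still to be returned pay the copy cost, and only if a yield actually happens mid-bucket.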

There are other options, like complicating Cursor, but IMO this one is pretty straightforward, even if it means adding extract() back for this patch. Other thoughts?

Generated at Thu Feb 08 06:39:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.