[SERVER-55498] Prevent SBE from using unowned values from storage after yield Created: 24/Mar/21  Updated: 29/Oct/23  Resolved: 30/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Nikita Lapkov (Inactive) Assignee: Martin Neupauer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query Execution 2021-04-05, Query Execution 2021-04-19, Query Execution 2021-05-03
Participants:

 Description   

Currently, values read in SBE's scan stage are owned by storage engine. There could be a situation when the query reads the document from storage, and in the middle of processing the document, yield happens. Yield can lead to freeing up memory behind the document. If that happens, SBE will try to process invalid document.

For example, consider the plan

unwind arrayToIterate = arrayField
project arrayField = getField(inputDocument, "a")
scan readInto = inputDocument

and input document

{a: [1, 2, 3]}

. Yield can happen after unwind returns ADVANCED, processing element 1, potentially freeing document stored in inputDocument slot and invalidating array in arrayField.

Classic engine handles this case by making pessimistic copies.

After talking with anton.korshunov, martin.neupauer and david.storch, we have come up with a different approach. ScanStage::save() should make a copy of a document read from storage. restore() method of all stages which use values from stages below should check if these values are still the same. For instance, UnwindStage::restore() should check if getViewOfValue() for the array is equal to the value it traverses. If not, stage should update its internal state to use new value. Project stage should always recompute all expressions, since they can use slot with changed values.

We should make a quick WIP patch to test this approach. If we encounter any major problems, we can switch to making pessimistic copies for the time being.



 Comments   
Comment by Githook User [ 29/Apr/21 ]

Author:

{'name': 'Martin Neupauer', 'email': 'xmaton@messengeruser.com'}

Message: SERVER-55498 Prevent SBE from using unowned values from storage after
yield

This change ensures that accesors produce valid values after yield.
Branch: master
https://github.com/mongodb/mongo/commit/a1dd77a37ab054564904ce8d0f04dacca04a9d2a

Comment by Githook User [ 19/Apr/21 ]

Author:

{'name': 'Martin Neupauer', 'email': 'xmaton@messengeruser.com'}

Message: SERVER-55498 Prevent SBE from using unowned values from storage after
yield

Another patch in series of patches to fix up dangling values.
This switches the merge streams to the switched accessor sidestepping
the problem completely.
Branch: master
https://github.com/mongodb/mongo/commit/e840e9f3ea2bd1517489b3bfbfee280efe9b392e

Comment by Githook User [ 15/Apr/21 ]

Author:

{'name': 'Martin Neupauer', 'email': 'xmaton@messengeruser.com'}

Message: SERVER-55498 Prevent SBE from using unowned values from storage after
yield

This is a part of the multi commit approach. Here we simplify
BranchStage and UnionStage so they are immune to the problem with
unowned values.
Branch: master
https://github.com/mongodb/mongo/commit/71afd01ed1cc63f719a51b3f0aca844b4393f8ac

Generated at Thu Feb 08 05:36:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.