[SERVER-81331] Spilling in SBE may lead to read on destroyed catalog object Created: 22/Sep/23  Updated: 29/Nov/23  Resolved: 17/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Bug Priority: Major - P3
Reporter: Ian Boros Assignee: Martin Neupauer
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Related
related to SERVER-74133 Spilling to TemporaryRecordStores in ... Backlog
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QE 2023-10-02, QE 2023-10-16, QE 2023-10-30, QE 2023-11-13, QE 2023-11-27
Participants:
Case:
Linked BF Score: 129

 Description   

TLDR: SBE may read freed memory after spilling, when there are concurrent catalog-changing operations.

Description of Bug
SBE is running a query, reading at a particular snapshot. A blocking stage then decides to spill. The spill operation performs a write and commits it in a separate WriteUnitOfWork under the same RecoveryUnit. After this write commits, our snapshot may advance. As part of advancing the snapshot, the CollectionCatalog shared_ptr decorating decorating the current Snapshot is destroyed.

Today SBE does not handle this situation, and assumes that the CollectionCatalog remains the same after the write, as it keeps pointers to the current Collection in the CollectionPtr objects it stores.

In the (unlikely) case where the spilling query's RecoveryUnit holds the last reference to the CollectionCatalog, (or the case where another thread destroys the CollectionCatalog immediately after the spill write commits), attempting to call CollectionPtr::yield() will result in a use-after-free, as the CollectionPtr will try to read the uuid of a deleted Collection object.

How to Fix it

  • One option is to treat spills as "yield events." We save the plan's state before committing the spill write, and restore it afterwards. This way the plan is never holding onto pointers to a destroyed CollectionCatalog.
  • Another option is to spill in a separate storage transaction. SERVER-61116 and SERVER-74133 discusses why this requires a some delicacy. It is possible to deadlock when using multiple WT sessions from the same thread.
  • Don't spill to a temporary record store.
  • Anything else.

Notes
As louis.williams@mongodb.com discussed in SERVER-74133, there are other problems with spilling to a temporary RecordStore, so we may want to deal with both of these issues at the same time.



 Comments   
Comment by Steve Tarzia [ 17/Nov/23 ]

Note that a backport is not needed because the merge happened before the 7.2 branch cut.  The fix is in https://github.com/10gen/mongo/commits/v7.2/src/mongo/db/storage/recovery_unit.h

Comment by Githook User [ 02/Nov/23 ]

Author:

{'name': 'Martin Neupauer', 'email': 'xmaton@messengeruser.com', 'username': ''}

Message: SERVER-81331 Spilling
Branch: master
https://github.com/mongodb/mongo/commit/3003c2d9b532ddb2ba8f8825902dbce5881b89c1

Generated at Thu Feb 08 06:46:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.