[SERVER-70338] Query yield accesses the storage engine without locks during shutdown and rollback Created: 07/Oct/22  Updated: 13/Dec/23  Resolved: 25/Apr/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0, 7.0.5, 6.0.13

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-74809 Yield call into storage engine after ... Closed
is duplicated by SERVER-79761 GlobalLock can segfault due to not ac... Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.0, v6.0, v5.0, v4.4
Sprint: Execution Team 2022-11-14, Execution Team 2023-04-03, Execution Team 2023-04-17, Execution Team 2023-05-01
Participants:
Linked BF Score: 103

 Description   

The yielding code in PlanYieldPolicy does the following:

  • Releases its locks
  • Rolls back the storage transaction via abandonSnapshot()
  • Re-acquires its locks

The global lock synchronizes access to the storage engine for shutdown and rollback. Because abandonSnapshot() runs after the locks are released, the operation can access the storage engine without holding the global lock while a shutdown or rollback is in progress.
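
A minimal sketch of the ordering problem, for illustration only. FakeLocker, FakeRecoveryUnit and yieldLocksFirst are hypothetical stand-ins, not the server's real classes; the only name taken from this ticket is abandonSnapshot(). The sketch records every storage call made while the global lock is not held:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the Locker; not MongoDB's real class.
struct FakeLocker {
    bool globalLockHeld = true;
    void releaseLocks() { globalLockHeld = false; }
    void reacquireLocks() { globalLockHeld = true; }
};

// Hypothetical stand-in for the recovery unit that owns the storage snapshot.
struct FakeRecoveryUnit {
    const FakeLocker& locker;
    std::vector<std::string> unlockedAccesses;  // storage calls made without the global lock

    explicit FakeRecoveryUnit(const FakeLocker& l) : locker(l) {}

    // Models abandonSnapshot(): in this sketch every call counts as a storage engine access.
    void abandonSnapshot() {
        if (!locker.globalLockHeld) {
            unlockedAccesses.push_back("abandonSnapshot");
        }
    }
};

// The ordering described above: locks are dropped before the snapshot is abandoned.
void yieldLocksFirst(FakeLocker& locker, FakeRecoveryUnit& ru) {
    locker.releaseLocks();
    ru.abandonSnapshot();  // reaches the storage engine with no global lock held
    locker.reacquireLocks();
}
```

After one call to yieldLocksFirst, unlockedAccesses holds one entry. That is the window in which shutdown or rollback could be running concurrently.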

This ordering has existed since at least 3.6.



 Comments   
Comment by Githook User [ 13/Dec/23 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-70338 Abandon snapshot while holding locks during query yield

  • Alters the Locker contract to require callers to check if they can yield locks before yielding
  • Reverses the ordering of yielding such that the snapshot is released before releasing locks
  • Refactors the PlanYieldPolicy to allow it to override the requested YieldPolicy if necessary

(cherry picked from commit 55877fcfb5e8ac0b23f65862cd1d2f9b439c07f6)
(cherry picked from commit ee30bfbb242d32fb79cb1d309a099d6cf6099329)

GitOrigin-RevId: 5ef84cf4a6616131694bc4c269d83c05fd570648
Branch: v6.0
https://github.com/mongodb/mongo/commit/f020d8e8e10464b95720defd8b56b8a4d53e0af8
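
The three changes listed in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: every type and function name here is hypothetical except abandonSnapshot(), and the choice to skip the yield entirely when the locker cannot yield is an assumption of this sketch.

```cpp
// Hypothetical policy values; the server's real YieldPolicy enum is not reproduced here.
enum class SketchYieldPolicy { kYieldLocks, kKeepLocks };

// Hypothetical stand-in for the Locker.
struct SketchLocker {
    bool globalLockHeld = true;
    bool yieldable = true;  // callers must consult canYield() before yielding
    bool canYield() const { return yieldable; }
    void releaseLocks() { globalLockHeld = false; }
    void reacquireLocks() { globalLockHeld = true; }
};

// Hypothetical stand-in for the recovery unit.
struct SketchRecoveryUnit {
    const SketchLocker& locker;
    int snapshotsAbandoned = 0;
    int unlockedAccesses = 0;  // storage calls made without the global lock

    explicit SketchRecoveryUnit(const SketchLocker& l) : locker(l) {}

    void abandonSnapshot() {
        ++snapshotsAbandoned;
        if (!locker.globalLockHeld) {
            ++unlockedAccesses;
        }
    }
};

// Overrides the requested policy when the locker reports that it cannot yield.
SketchYieldPolicy effectivePolicy(SketchYieldPolicy requested, const SketchLocker& locker) {
    if (requested == SketchYieldPolicy::kYieldLocks && !locker.canYield()) {
        return SketchYieldPolicy::kKeepLocks;
    }
    return requested;
}

// Reversed ordering: abandon the snapshot first, then release and re-acquire locks.
// Returns true if locks were actually yielded.
bool yieldSnapshotFirst(SketchYieldPolicy requested, SketchLocker& locker, SketchRecoveryUnit& ru) {
    if (effectivePolicy(requested, locker) == SketchYieldPolicy::kKeepLocks) {
        return false;  // assumption of this sketch: no yield at all in this case
    }
    ru.abandonSnapshot();  // still under the global lock
    locker.releaseLocks();
    // Other operations, including shutdown or rollback, may run here.
    locker.reacquireLocks();
    return true;
}
```

With this ordering, unlockedAccesses stays at zero because the only storage call happens before releaseLocks().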

Comment by Githook User [ 11/Dec/23 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-70338 Abandon snapshot while holding locks during query yield

  • Alters the Locker contract to require callers to check if they can yield locks before yielding
  • Reverses the ordering of yielding such that the snapshot is released before releasing locks
  • Refactors the PlanYieldPolicy to allow it to override the requested YieldPolicy if necessary

(cherry picked from commit 55877fcfb5e8ac0b23f65862cd1d2f9b439c07f6)

GitOrigin-RevId: 1854bf2ace299e22995bf8d7e64195107f703029
Branch: v7.0
https://github.com/mongodb/mongo/commit/7c1cd6a06c42b15fc35d85f512a7e166bf9bffcf

Comment by Githook User [ 25/Apr/23 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-70338 Abandon snapshot while holding locks during query yield

Comment by Jordi Olivares Provencio [ 09/Nov/22 ]

The rollback path is also safe. Rollback returns an EBUSY error if there are any active cursors or transactions. That is, it will only succeed if there are no active users in WT. This error gets retried in 6.0+ as of SERVER-63989.
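
A rough sketch of the retry behaviour described in the comment above, assuming a simple bounded loop. retryWhileBusy and its parameters are hypothetical; how the server actually retries (backoff, limits, logging) is not stated in this ticket.

```cpp
#include <cerrno>
#include <functional>

// Hypothetical helper: calls `op` until it returns something other than EBUSY
// or until `maxAttempts` calls have been made. Returns the last result.
// If maxAttempts <= 0, `op` is never called and EBUSY is returned.
int retryWhileBusy(const std::function<int()>& op, int maxAttempts) {
    int rc = EBUSY;
    for (int attempt = 0; attempt < maxAttempts && rc == EBUSY; ++attempt) {
        rc = op();  // e.g. a rollback call that fails while cursors are still open
    }
    return rc;
}
```

The property the comment relies on is that the rollback call itself refuses to proceed while any cursor or transaction is active, so the retry only succeeds once those users are gone.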

Comment by Jordi Olivares Provencio [ 31/Oct/22 ]

Right now the shutdown path is safe. Sessions will all get killed and drained before shutting down the storage engine.

This happens here.

Generated at Thu Feb 08 06:15:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.