[SERVER-35361] filemd5 command fails to safely clean up PlanExecutor after manual yield Created: 01/Jun/18  Updated: 29/Oct/23  Resolved: 22/Jun/18

Status: Closed
Project: Core Server
Component/s: GridFS
Affects Version/s: None
Fix Version/s: 4.0.1, 4.1.1

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Charlie Swanson
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Query 2018-07-02
Participants:
Linked BF Score: 60

 Description   

The filemd5 command creates a PlanExecutor, and destroying a PlanExecutor requires holding the collection lock in MODE_IS. If the command exits via an exception while yielded, the lock is not held.

For example, if the command's AutoGetCollectionForReadCommand throws an exception because a step-down interrupts it, the PlanExecutor will be destroyed without holding a lock.

While being destroyed, the PlanExecutor must tell the collection's cursor manager that it is being deleted. Doing this without a lock can lead to a race, as described in SERVER-25694.
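
The failure mode above can be modeled in a few lines of plain C++. This is only a sketch under stated assumptions: LockState, ScopedCollectionLock, and TrackedExecutor are hypothetical stand-ins, not MongoDB's real classes, and the "interrupt" is simulated by a constructor flag. It shows that when the lock is released for a yield and reacquisition throws, stack unwinding destroys the executor while no lock is held.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the collection lock state (not a MongoDB type).
struct LockState {
    bool held = false;
};

// Hypothetical RAII lock. Acquisition can be "interrupted", which models
// a step-down interrupting lock reacquisition after a yield.
class ScopedCollectionLock {
public:
    explicit ScopedCollectionLock(LockState& state, bool interrupted = false)
        : _state(state) {
        if (interrupted) {
            throw std::runtime_error("interrupted by step-down");
        }
        _state.held = true;
    }
    ~ScopedCollectionLock() { _state.held = false; }
    ScopedCollectionLock(const ScopedCollectionLock&) = delete;
    ScopedCollectionLock& operator=(const ScopedCollectionLock&) = delete;

private:
    LockState& _state;
};

// Hypothetical executor whose destructor records whether the lock was held.
class TrackedExecutor {
public:
    TrackedExecutor(const LockState& state, bool& lockedAtDestruction)
        : _state(state), _lockedAtDestruction(lockedAtDestruction) {}
    ~TrackedExecutor() { _lockedAtDestruction = _state.held; }

private:
    const LockState& _state;
    bool& _lockedAtDestruction;
};

// Returns whether the lock was held when the executor was destroyed.
bool destroyedUnderLockWithManualYield() {
    LockState state;
    bool lockedAtDestruction = true;
    try {
        std::unique_ptr<TrackedExecutor> exec;
        auto lock = std::make_unique<ScopedCollectionLock>(state);
        exec = std::make_unique<TrackedExecutor>(state, lockedAtDestruction);
        lock.reset();  // manual yield: the lock is released here
        // Reacquisition is interrupted and throws; exec is then destroyed
        // during unwinding with no lock held.
        lock = std::make_unique<ScopedCollectionLock>(state, true);
    } catch (const std::runtime_error&) {
        // The command would report the interruption to the caller.
    }
    return lockedAtDestruction;
}
```

In this model, destroyedUnderLockWithManualYield() returns false, which corresponds to the unsafe destruction described in this ticket.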



 Comments   
Comment by Githook User [ 29/Jun/18 ]

Author: Charlie Swanson (cswanson310) <charlie.swanson@mongodb.com>

Message: SERVER-35361 Ensure collection is locked when deleting filemd5's PlanExecutor

(cherry picked from commit ab1f620bc37d174b43a11f40fb30c3b4b31584d1)
Branch: v4.0
https://github.com/mongodb/mongo/commit/669337462d7d2be1e61060394900f58974b2a544

Comment by Githook User [ 20/Jun/18 ]

Author: Charlie Swanson (cswanson310) <charlie.swanson@mongodb.com>

Message: SERVER-35361 Ensure collection is locked when deleting filemd5's PlanExecutor
Branch: master
https://github.com/mongodb/mongo/commit/ab1f620bc37d174b43a11f40fb30c3b4b31584d1

Comment by Charlie Swanson [ 01/Jun/18 ]

We believe we can fix this by removing the manual yield behavior and holding an intent lock for the whole operation. Particularly on master, this should not have any performance impact on document-locking storage engines.
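
A minimal sketch of that idea in plain C++, assuming hypothetical stand-in types (LockFlag, IntentLockGuard, and RecordingExecutor are illustrative names, not MongoDB's real classes): if the lock guard is declared before the executor and never released, C++ destroys the executor first, so the lock is still held at that point on both the normal path and the exception path.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the lock state (not a MongoDB type).
struct LockFlag {
    bool held = false;
};

// Hypothetical RAII guard modeling an intent lock held for the whole command.
class IntentLockGuard {
public:
    explicit IntentLockGuard(LockFlag& flag) : _flag(flag) { _flag.held = true; }
    ~IntentLockGuard() { _flag.held = false; }
    IntentLockGuard(const IntentLockGuard&) = delete;
    IntentLockGuard& operator=(const IntentLockGuard&) = delete;

private:
    LockFlag& _flag;
};

// Hypothetical executor whose destructor records whether the lock was held.
class RecordingExecutor {
public:
    RecordingExecutor(const LockFlag& flag, bool& lockedAtDestruction)
        : _flag(flag), _lockedAtDestruction(lockedAtDestruction) {}
    ~RecordingExecutor() { _lockedAtDestruction = _flag.held; }

private:
    const LockFlag& _flag;
    bool& _lockedAtDestruction;
};

// Returns whether the lock was held when the executor was destroyed.
bool destroyedUnderLockWithoutYield(bool interruptMidway) {
    LockFlag flag;
    bool lockedAtDestruction = false;
    try {
        // Declared first, so it is destroyed last: locals are destroyed in
        // reverse order of declaration, including during stack unwinding.
        IntentLockGuard lock(flag);
        auto exec = std::make_unique<RecordingExecutor>(flag, lockedAtDestruction);
        if (interruptMidway) {
            throw std::runtime_error("interrupted by step-down");
        }
    } catch (const std::runtime_error&) {
        // The command would report the interruption to the caller.
    }
    return lockedAtDestruction;
}
```

In this model, destroyedUnderLockWithoutYield returns true for both the uninterrupted and the interrupted case, because there is no window in which the executor exists without the lock.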

As part of this work, we should audit other usages of YIELD_MANUAL to see if they are susceptible to a similar problem.
