[SERVER-37398] Concurrent aggregations with $out on a shard server can deadlock with each other Created: 28/Sep/18  Updated: 29/Oct/23  Resolved: 14/Nov/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 4.1.3
Fix Version/s: 4.1.6

Type: Bug Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Charlie Swanson
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2018-11-05, Query 2018-11-19
Participants:
Linked BF Score: 45

 Description   

A single aggregation request with $out takes the global lock three times:

  • in IS mode, on the client's OperationContext, as part of taking a collection lock on the input collection
  • in IS mode, on one of the ShardServerCatalogCacheLoader's own OperationContexts, as part of taking collection locks on config.collections and config.chunks
  • in X mode, on the client's OperationContext, for renameIfOptionsAndIndexesHaveNotChanged()

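Once a global X lock request is enqueued, all later global lock requests queue behind it, including IS requests that would otherwise be compatible with the current holders.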

So, if one agg request's global X lock request is enqueued between a second agg request's two global IS lock requests, the three threads involved (the first agg's client thread, plus the second agg's client thread and loader thread) can deadlock.
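To make the interleaving concrete, here is a minimal toy model of it in Python. This is only a sketch, not the server's lock manager: ToyGlobalLock, the thread names, and the rule that agg 2's client blocks on its loader while still holding IS are all assumptions made for illustration, and agg 1's own earlier IS acquisitions are left out. The only behavior borrowed from the description is that an enqueued X request blocks every later request.

```python
from collections import deque

# Compatibility of (held mode, requested mode) for the two modes in this ticket.
COMPATIBLE = {
    ("IS", "IS"): True,
    ("IS", "X"): False,
    ("X", "IS"): False,
    ("X", "X"): False,
}


class ToyGlobalLock:
    """Hypothetical FIFO lock: a request is granted only if it is compatible
    with every current holder AND nothing is already waiting ahead of it."""

    def __init__(self):
        self.granted = []       # (owner, mode) pairs currently holding the lock
        self.waiting = deque()  # (owner, mode) pairs queued in arrival order

    def request(self, owner, mode):
        compatible = all(COMPATIBLE[(held, mode)] for _, held in self.granted)
        if compatible and not self.waiting:
            self.granted.append((owner, mode))
            return True
        self.waiting.append((owner, mode))
        return False

    def blockers(self, owner):
        """Owners that must make progress before owner's queued request is granted."""
        result = set()
        for position, (waiter, mode) in enumerate(self.waiting):
            if waiter != owner:
                continue
            # Holders whose mode conflicts with the queued request.
            result |= {h for h, held in self.granted if not COMPATIBLE[(held, mode)]}
            # Every request queued earlier (FIFO fairness).
            result |= {w for w, _ in list(self.waiting)[:position]}
        return result


def has_cycle(wait_for):
    """Depth-first search for a cycle in a wait-for graph {thread: blockers}."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in wait_for.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in wait_for)


lock = ToyGlobalLock()
first = lock.request("agg2_client", "IS")   # granted: nothing held or queued
second = lock.request("agg1_client", "X")   # queued: conflicts with the held IS
third = lock.request("agg2_loader", "IS")   # queued: stuck behind the X request

wait_for = {
    "agg1_client": lock.blockers("agg1_client"),
    "agg2_loader": lock.blockers("agg2_loader"),
    # Assumption: agg 2's client waits for its loader while still holding IS.
    "agg2_client": {"agg2_loader"},
}
print("deadlock:", has_cycle(wait_for))
```

In this model the wait-for graph is agg1_client → agg2_client → agg2_loader → agg1_client, so none of the three can proceed. If the X request arrives after both IS requests have been granted, the loader never queues behind it and the cycle does not form.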

I think it may also be possible for any operation that takes a global X lock to deadlock with a single agg with $out, but I am not totally sure, which is why I titled the ticket as I did for now.

I am not sure whether this affects versions before master, so I have just marked it as affecting 4.1.3 for now.



 Comments   
Comment by Githook User [ 14/Nov/18 ]

Author: Charlie Swanson (cswanson310) <charlie.swanson@mongodb.com>

Message: SERVER-37398 Prove $out can no longer deadlock
Branch: master
https://github.com/mongodb/mongo/commit/dd2b307956c44340d46fe05eabf5bbd08f96186c

Comment by Charlie Swanson [ 07/Nov/18 ]

I'm going to remove the 'is depended on by' links and replace them with 'related to' links. BF-10750 and BF-10722 should have been resolved by SERVER-36813.

Comment by Charlie Swanson [ 07/Nov/18 ]

Update: looks like SERVER-36813 removed the deadlock scenario (mostly by accident).
