[SERVER-69893] Add the ability to acquire lock manager mutex resources by name Created: 22/Sep/22  Updated: 14/Nov/23  Resolved: 14/Nov/23

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Kaloian Manassiev
Assignee: Kaloian Manassiev
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-69435 Make the CSS acquisition a RAII Closed
Related
is related to SERVER-70322 BalancerStatsRegistry inappropriately... Closed
is related to SERVER-70431 Checkpointing manually constructs Res... Closed
is related to SERVER-78229 createIndexes should acquire the Coll... Closed
is related to SERVER-70198 Use one single RESOURCE_MUTEX to prot... Closed
is related to SERVER-70467 Prevent manually using RESOURCE_MUTEX Closed
Sprint: Sharding EMEA 2022-10-03, Sharding EMEA 2022-10-17, Sharding EMEA 2022-10-31, Execution EMEA Team 2023-07-24
Participants:

 Description   

In Sharding we have a requirement to be able to lock some object by name (where an object is a database or collection) for a short period of time, while we perform some in-memory modifications on that object.

These are much lighter-weight locks than our current DB/Collection locks in that we will never do blocking operations under them and also they do not require 2-phase locking.

The current RESOURCE_MUTEX is unsafe to acquire with a manually constructed string that contains the DB or Collection name, because the hash-to-string resolution relies on the hash mapping to a string name.

I am realising now that this is not correct, because RESOURCE_MUTEX's implementation (and its logging specifically) assumes that the ResourceMutex class is in control of generating the ResourceIds, rather than just taking the hash of the string that's passed as the resource name. Thus, right now these usages violate the implicit contract of RESOURCE_MUTEX.
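A minimal sketch of the contract described above, not the actual lock manager code. `MutexRegistry`, `registerMutex`, `labelFor` and `adHocId` are hypothetical names introduced only for illustration. The assumption is that the registry hands out ids itself and keeps an id-to-label table for logging, so an id obtained by hashing a caller-supplied string bypasses that table.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a component that generates resource ids itself.
class MutexRegistry {
public:
    // Registers a label and returns an id chosen by the registry.
    std::uint64_t registerMutex(const std::string& label) {
        const std::uint64_t id = _nextId++;
        const bool inserted = _labels.emplace(id, label).second;
        assert(inserted);  // ids are handed out sequentially, so never reused
        (void)inserted;
        return id;
    }

    // Resolves an id back to a label, as logging would need to do.
    std::string labelFor(std::uint64_t id) const {
        auto it = _labels.find(id);
        return it == _labels.end() ? std::string("<unresolved>") : it->second;
    }

private:
    std::uint64_t _nextId = 0;
    std::unordered_map<std::uint64_t, std::string> _labels;
};

// An id built by hashing an arbitrary string never goes through the registry,
// so the registry cannot map it back to the name that was hashed.
inline std::uint64_t adHocId(const std::string& name) {
    return static_cast<std::uint64_t>(std::hash<std::string>{}(name));
}
```

In this sketch, looking up `adHocId("test.coll")` either fails to resolve or, if the hash happens to equal a registered id, resolves to an unrelated label.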

Given the requirements above, we still need to be able to lock something by name and we don't care about hash collisions.

For this purpose I would like to extend the lock manager with the following resource types with stricter semantics:

  • TopResourceMutex: Not allowed to be locked by a random string; the lock manager controls which hash gets assigned to each one; the only way to use it is through a TopLevelResourceMutex class; can only be the first lock acquired. The use case is to disallow, for example, two instances of the same command from running concurrently;
  • ResourceMutex: Same as the current resource mutex; cannot do blocking operations while holding it;
  • NamedLock: Allowed to be locked by a random string; not interruptible, and no Global/DB/Collection locks can be taken after it (i.e., it must be at the bottom of the hierarchy)
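The ordering rules in the list above could be checked roughly as follows. This is an illustrative sketch only: `LockKind` and `LockOrderChecker` are hypothetical names, and the checker models just the two ordering constraints stated here (top-level mutex first, no Global/DB/Collection lock after a named lock). It does not model interruptibility or the ban on blocking operations under a resource mutex.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical lock kinds corresponding to the proposal.
enum class LockKind { TopLevelMutex, Global, Database, Collection, ResourceMutex, NamedLock };

// Tracks the locks held by one operation and rejects acquisitions that
// violate the proposed ordering.
class LockOrderChecker {
public:
    void acquire(LockKind kind) {
        if (kind == LockKind::TopLevelMutex && !_held.empty())
            throw std::logic_error("top-level mutex must be the first lock acquired");
        if (isHierarchyLock(kind) && holds(LockKind::NamedLock))
            throw std::logic_error("no Global/DB/Collection lock may follow a named lock");
        _held.push_back(kind);
    }

    // Releases the most recently acquired lock.
    void releaseLast() {
        assert(!_held.empty());
        _held.pop_back();
    }

    std::size_t heldCount() const { return _held.size(); }

private:
    static bool isHierarchyLock(LockKind k) {
        return k == LockKind::Global || k == LockKind::Database || k == LockKind::Collection;
    }

    bool holds(LockKind k) const {
        return std::find(_held.begin(), _held.end(), k) != _held.end();
    }

    std::vector<LockKind> _held;
};
```

With a checker like this, acquiring `TopLevelMutex`, then `Database`, then `NamedLock` succeeds, while a later `Collection` acquisition is rejected.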


 Comments   
Comment by Kaloian Manassiev [ 14/Nov/23 ]

In the end we haven't had a pressing need for this change and it would be a net improvement if we just started using the database/collection locks in smaller scopes (i.e., not for the duration of an entire DDL or CRUD operation, but just for committing).
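As a rough illustration of taking the lock in a smaller scope, the sketch below does the preparatory work without any lock and holds the lock only while publishing the result. `Catalog`, `commit`, `get` and `createCollection` are hypothetical names, and a plain `std::mutex` stands in for the database/collection lock.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical in-memory catalog; the mutex stands in for a
// database/collection lock.
class Catalog {
public:
    // The lock is held only for the duration of the commit.
    void commit(const std::string& ns, std::string metadata) {
        std::lock_guard<std::mutex> lk(_mutex);
        _entries[ns] = std::move(metadata);
    }

    std::string get(const std::string& ns) const {
        std::lock_guard<std::mutex> lk(_mutex);
        auto it = _entries.find(ns);
        return it == _entries.end() ? std::string() : it->second;
    }

private:
    mutable std::mutex _mutex;
    std::map<std::string, std::string> _entries;
};

// Builds the new metadata without holding the lock, then locks only to publish.
inline void createCollection(Catalog& catalog, const std::string& ns) {
    std::string metadata = "options-for-" + ns;  // stands in for longer-running work
    catalog.commit(ns, std::move(metadata));
}
```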

Closing as Won't Fix.

Comment by Kaloian Manassiev [ 21/Jun/23 ]

I would like to reopen the discussion around this proposal: had we at least done the part with two-level resource mutexes, we would have caught deadlocks like the one in SERVER-78229.

We don't necessarily need to do the boost::optional improvements that are in the PR, but at least the two-level mutexes.

Comment by Jordi Serra Torrens [ 24/Oct/22 ]

SERVER-70610 took a different approach where we use a "normal" RESOURCE_MUTEX for each database, so we don't need named resourceMutexes now.
