[SERVER-30932] dbCheck command violates lock ordering by acquiring lock on "local" database first
Created: 02/Sep/17  Updated: 09/Dec/21  Resolved: 09/Dec/21

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Josef Ahmad
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-31070 make dbCheck a test command Closed
is related to SERVER-61748 dbCheck should not hold a strong data... Closed
Sprint: Execution Team 2021-12-13
Participants:
Linked BF Score: 0

Description

The AutoGetDbForDbCheck RAII class first acquires a lock on the "local" database in MODE_IX and then acquires a lock on the database to check in MODE_S. This is the reverse of the lock ordering that other database operations use when calling repl::logOp(): those threads first acquire a lock on the target database and only then acquire a lock on the "local" database. Acquiring the same pair of locks in opposite orders creates the potential for deadlock.

AutoGetDbForDbCheck::AutoGetDbForDbCheck(OperationContext* opCtx, const NamespaceString& nss)
    : localLock(opCtx, "local"_sd, MODE_IX), agd(opCtx, nss.db(), MODE_S) {}
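
To make the ordering conflict described above concrete, here is a minimal, self-contained sketch that records the order in which each thread acquires named resources and checks the resulting "held before" graph for a cycle. Everything in it (LockOrderGraph, dbCheckOrderConflicts, swappedOrderConflicts, the database name "test") is hypothetical and is not part of the MongoDB code base. It also ignores lock modes: whether a given inversion actually deadlocks depends on which modes conflict, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical lock-order checker: stores "A was acquired before B" edges
// and reports whether the edges form a cycle (an ordering inversion).
class LockOrderGraph {
public:
    // Record that some thread acquires the named resources in this order.
    void recordAcquisitionOrder(const std::vector<std::string>& order) {
        assert(!order.empty());
        for (std::size_t i = 0; i < order.size(); ++i) {
            _edges[order[i]];  // make sure the node exists even without out-edges
            for (std::size_t j = i + 1; j < order.size(); ++j) {
                _edges[order[i]].insert(order[j]);
            }
        }
    }

    // Depth-first search for a cycle in the "acquired before" graph.
    bool hasCycle() const {
        std::set<std::string> done;
        std::set<std::string> onStack;
        for (const auto& entry : _edges) {
            if (visit(entry.first, done, onStack)) {
                return true;
            }
        }
        return false;
    }

private:
    bool visit(const std::string& node,
               std::set<std::string>& done,
               std::set<std::string>& onStack) const {
        if (onStack.count(node)) return true;   // back edge: cycle found
        if (done.count(node)) return false;     // already fully explored
        onStack.insert(node);
        auto it = _edges.find(node);
        if (it != _edges.end()) {
            for (const auto& next : it->second) {
                if (visit(next, done, onStack)) return true;
            }
        }
        onStack.erase(node);
        done.insert(node);
        return false;
    }

    std::map<std::string, std::set<std::string>> _edges;
};

// Order from the description: dbCheck locks "local" first, writers lock it last.
bool dbCheckOrderConflicts() {
    LockOrderGraph graph;
    graph.recordAcquisitionOrder({"local", "test"});  // dbCheck thread
    graph.recordAcquisitionOrder({"test", "local"});  // writer calling logOp()
    return graph.hasCycle();
}

// Same check with the dbCheck order swapped to match the writers.
bool swappedOrderConflicts() {
    LockOrderGraph graph;
    graph.recordAcquisitionOrder({"test", "local"});  // dbCheck thread, swapped
    graph.recordAcquisitionOrder({"test", "local"});  // writer calling logOp()
    return graph.hasCycle();
}
```

With the order from the description the graph contains the cycle local -> test -> local; with the two acquisitions swapped, every thread follows the same order and no cycle exists.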



Comments
Comment by Josef Ahmad [ 09/Dec/21 ]

This issue was resolved in SERVER-61748 which removed the local database lock acquisition.

Comment by Max Hirschhorn [ 08/Sep/17 ]

geert.bosch and I discussed this issue in person. I think we're on the same page now that the locking order in the "dbCheck" thread is incorrect. It sounds like this locking order was chosen to avoid an issue when using the MMAPv1 storage engine, since the MMAPv1 flush lock is taken in the same mode as the first global lock acquisition. We'll likely need to acquire the global lock in an intent mode explicitly, in addition to swapping the lock order, to work around both issues.
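
As a rough illustration of the ordering proposed in this comment (explicit global intent lock, then the database being checked, then "local"), the sketch below uses mock lock objects that only log what they do. MockLock, AcquisitionLog, ReorderedDbCheckLocks and acquisitionTrace are hypothetical names, not MongoDB APIs, and MODE_IX for the global lock is an assumption, since the comment only says "an intent mode". The sketch relies on the C++ rule that non-static data members are initialized in declaration order, regardless of the order in the constructor's initializer list.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lock manager: it only records events.
struct AcquisitionLog {
    std::vector<std::string> events;
};

// Hypothetical RAII guard that logs acquire/release. Not MongoDB's lock types.
class MockLock {
public:
    MockLock(AcquisitionLog& log, std::string resource, const std::string& mode)
        : _log(log), _name(std::move(resource)) {
        _log.events.push_back("acquire " + _name + " " + mode);
    }
    ~MockLock() {
        _log.events.push_back("release " + _name);
    }
    MockLock(const MockLock&) = delete;
    MockLock& operator=(const MockLock&) = delete;

private:
    AcquisitionLog& _log;
    std::string _name;
};

// Hypothetical reordered guard: global intent lock, then the database to
// check, then "local". The mode chosen for the global lock is an assumption.
class ReorderedDbCheckLocks {
public:
    ReorderedDbCheckLocks(AcquisitionLog& log, const std::string& dbName)
        : _global(log, "global", "MODE_IX"),
          _db(log, dbName, "MODE_S"),
          _local(log, "local", "MODE_IX") {}

private:
    // Declaration order below is what fixes the acquisition order.
    MockLock _global;
    MockLock _db;
    MockLock _local;
};

// Acquire and release the locks once and return the recorded trace.
std::vector<std::string> acquisitionTrace(const std::string& dbName) {
    AcquisitionLog log;
    {
        ReorderedDbCheckLocks locks(log, dbName);
    }  // members are destroyed here in reverse declaration order
    return log.events;
}
```

Calling acquisitionTrace("test") yields the acquisitions in the order global, test, local, followed by the releases in reverse order.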

Comment by Ian Whalen (Inactive) [ 05/Sep/17 ]

Leaving in Triage for now. Geert to talk to Max about it.
