- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 6.0.2
- Component/s: None
- ALL
- Sharding EMEA 2022-10-17, Sharding EMEA 2022-10-31, Sharding EMEA 2022-11-14
This ticket is to investigate the lock timeouts caused by the range deleter on v6.0. Releasing the critical section times out while trying to acquire the collection lock, which appears to be held by the range deleter.
This is a problem on 6.0 because the ScopedRangeDeleterLock is acquired before running the deletion. This lock is necessary to prevent the FCV upgrade orphan counter code from setting incorrect counters on the range deletions. However, the ScopedRangeDeleterLock acquires the DBLock on the config database, which automatically acquires the GlobalLock, and by that point we have already acquired the DBLock on the user database, which has itself acquired the GlobalLock. This double acquisition means that the global lock is held recursively, so the yield policy is replaced with NO_YIELD. As a result, the deletion apparently never yields its locks, which would explain why the collection lock stays held long enough for the critical section release to time out.
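The interaction described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: ToyLocker, ToyDBLock, YieldPolicy and effectiveYieldPolicy are hypothetical names invented for illustration, not the server's real classes, and the model tracks nothing except how many times one operation holds the global lock.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for illustration only; not the server's real types.
enum class YieldPolicy { kYieldAuto, kNoYield };

struct ToyLocker {
    // Number of times this operation currently holds the global lock.
    int globalLockDepth = 0;

    void lockGlobal() { ++globalLockDepth; }
    void unlockGlobal() {
        assert(globalLockDepth > 0);
        --globalLockDepth;
    }

    // True once the global lock has been taken more than once by the same operation.
    bool isGlobalLockedRecursively() const { return globalLockDepth > 1; }
};

// RAII database lock that always takes the global lock first, mirroring the
// behaviour described in this ticket.
class ToyDBLock {
public:
    explicit ToyDBLock(ToyLocker& locker) : _locker(locker) { _locker.lockGlobal(); }
    ~ToyDBLock() { _locker.unlockGlobal(); }
    ToyDBLock(const ToyDBLock&) = delete;
    ToyDBLock& operator=(const ToyDBLock&) = delete;

private:
    ToyLocker& _locker;
};

// Assumption of this model: a recursively held global lock forces the
// requested yield policy down to "no yield".
YieldPolicy effectiveYieldPolicy(const ToyLocker& locker, YieldPolicy requested) {
    return locker.isGlobalLockedRecursively() ? YieldPolicy::kNoYield : requested;
}
```

In this model, taking one ToyDBLock (the user database) leaves the requested policy untouched, while taking a second one on top of it (the config database) downgrades the policy to kNoYield until the inner lock is released.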
We also cannot change the ScopedRangeDeleterLock to never acquire the global lock, because some callers of the persistUpdatedNumOrphans function during migrations do not already hold the global lock.
One option would be to change the ScopedRangeDeleterLock to acquire the GlobalLock only when we do not already hold it, for example by replacing the _configLock initialization with the code below.
_configLock(opCtx, NamespaceString::kConfigDb, MODE_IX, Date_t::max(), opCtx->lockState()->isLocked())
But it may be better to find a more general solution that also considers the general problem of the ScopedRangeDeleterLock in SERVER-70322.
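The conditional acquisition proposed above can be sketched with a standalone model. Everything here is hypothetical and simplified for illustration: ToyLocker, ToyDBLock and ToyScopedRangeDeleterLock are invented names, and the toy isLocked() only reports whether the global lock is held, which is an assumption of this model rather than a statement about the real lockState()->isLocked().

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for illustration only; not the server's real types.
struct ToyLocker {
    // Number of times this operation currently holds the global lock.
    int globalLockDepth = 0;

    // In this toy model, "locked" means the global lock is already held.
    bool isLocked() const { return globalLockDepth > 0; }
    bool isGlobalLockedRecursively() const { return globalLockDepth > 1; }
};

// RAII database lock that takes the global lock unless told to skip it.
class ToyDBLock {
public:
    ToyDBLock(ToyLocker& locker, bool skipGlobalLock)
        : _locker(locker), _tookGlobal(!skipGlobalLock) {
        if (_tookGlobal) {
            ++_locker.globalLockDepth;
        }
    }
    ~ToyDBLock() {
        if (_tookGlobal) {
            --_locker.globalLockDepth;
        }
    }
    ToyDBLock(const ToyDBLock&) = delete;
    ToyDBLock& operator=(const ToyDBLock&) = delete;

private:
    ToyLocker& _locker;
    bool _tookGlobal;
};

// Stand-in for the scoped lock: it skips the global lock only when the
// caller already holds it, so callers without it are still covered.
class ToyScopedRangeDeleterLock {
public:
    explicit ToyScopedRangeDeleterLock(ToyLocker& locker)
        : _configLock(locker, /*skipGlobalLock=*/locker.isLocked()) {}

private:
    ToyDBLock _configLock;
};
```

Under these assumptions, a caller that holds no lock (the migration path) ends up with the global lock held once, and a caller that already holds the user database lock (the range deleter path) also ends up with it held once rather than recursively.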
- duplicates
- SERVER-70864 Get rid of fine grained scoped range deleter lock (Closed)