[SERVER-48399] Writing config document to "local.system.replset" should not acquire database lock in stronger mode (X). Created: 26/May/20  Updated: 16/Feb/23  Resolved: 16/Feb/23

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Suganthi Mani Assignee: Backlog - Replication Team
Resolution: Duplicate Votes: 0
Labels: former-quick-wins
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-72897 Investigate unorthodox locking in Rep... Closed
Related
related to SERVER-50519 resumable index build hangs waiting f... Closed
related to SERVER-38341 Remove Parallel Batch Writer Mutex Closed
related to SERVER-48398 Writing config document to "local.sys... Backlog
related to SERVER-42055 Only acquire a collection IX lock to ... Closed
Assigned Teams:
Replication
Participants:
Linked BF Score: 5

 Description   

It seems that when a node persists a new config to the collection "local.system.replset", it takes the "local" database lock in exclusive (X) mode and the PBWM (Parallel Batch Writer Mutex) lock in IS mode. This can lead to 2 major side effects.
1) Since the PBWM lock is taken in IS mode, it can block the secondary oplog applier, which requires the PBWM in X mode. This can result in replication lag. This will be addressed by SERVER-48398.
2) Since the "local" database lock is taken in X mode, it can block all other readers and writers of the "local" database.

  • Mainly, if this node is the sync source for another node Y, then the oplog fetcher of node Y can be blocked behind the reconfig-via-heartbeat thread due to the database lock conflict, leading to replication lag.
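
The two conflicts described above follow from the standard multigranularity lock compatibility rules (IS, IX, S, X). The snippet below is a minimal illustrative sketch of those rules, not MongoDB's lock manager code. The names COMPATIBLE, conflicts, blocked_requests and LOCAL_DB_REQUESTS are hypothetical, and the lock modes assigned to the oplog fetcher read and the local writer are assumptions made for illustration.

```python
# Standard multigranularity lock compatibility: for each held mode,
# the set of modes another operation can acquire concurrently.
# Illustrative only; this is not MongoDB's lock manager.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),  # exclusive: conflicts with every other mode
}


def conflicts(held: str, requested: str) -> bool:
    """Return True if `requested` must wait while `held` is granted."""
    return requested not in COMPATIBLE[held]


def blocked_requests(held: str, requests: dict) -> list:
    """Return the names of the requests that conflict with the held mode."""
    return [name for name, mode in requests.items() if conflicts(held, mode)]


# Assumed modes that other operations request on the "local" database.
LOCAL_DB_REQUESTS = {
    "oplog fetcher read": "IS",
    "local writer": "IX",
}

# Side effect 1: PBWM held in IS blocks an applier that needs PBWM in X.
print(conflicts("IS", "X"))

# Side effect 2: "local" held in X blocks both readers and writers.
print(blocked_requests("X", LOCAL_DB_REQUESTS))

# With an intent-exclusive (IX) database lock instead, neither is blocked.
print(blocked_requests("IX", LOCAL_DB_REQUESTS))
```

Under these rules, taking the database lock in IX rather than X would leave both the reads and the writes shown here unblocked at the database level, which is the motivation for the weaker lock mode this ticket asks for.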


 Comments   
Comment by Lingzhi Deng [ 16/Feb/23 ]

Closing as a dup of SERVER-72897

Comment by Suganthi Mani [ 26/May/20 ]

The fix should be something similar to SERVER-42055. Possibly, as part of this ticket, we should also audit whether any methods besides storeLocalConfigDocument take a stronger lock on the "local" database.

Generated at Thu Feb 08 05:17:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.