[SERVER-46664] runCmdOnPrimaryAndAwaitResponse() should not run DBDirect client command with the rstl lock held. Created: 05/Mar/20  Updated: 29/Oct/23  Resolved: 09/Mar/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.4.0-rc0, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Backlog - Storage Execution Team
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-46663 runCmdOnPrimaryAndAwaitResponse() sho... Closed
Related
is related to SERVER-46704 Two phase index build can violate loc... Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Execution Team 2020-03-09
Participants:
Linked BF Score: 50

 Description   

Currently, runCmdOnPrimaryAndAwaitResponse() takes the RSTL lock in IX mode and then runs a DBDirectClient command using AlternativeClientRegion. If the DBDirectClient command takes the RSTL lock on the AlternativeClientRegion's opCtx, this can lead to a deadlock between the stepdown thread and the thread that calls runCmdOnPrimaryAndAwaitResponse().
The deadlock arises because the stepdown thread cannot acquire the RSTL lock: the caller's original opCtx (including its locks) has been stashed and replaced with a new opCtx by runCmdOnPrimaryAndAwaitResponse() using the AlternativeClientRegion class, so stepdown has no way to interrupt the original opCtx and make it release its locks. As a result, the stepdown thread is blocked behind the RSTL lock held by runCmdOnPrimaryAndAwaitResponse() because of the lock conflict. The DBDirectClient command is blocked behind stepdown, because stepdown has enqueued its RSTL request in X mode. And runCmdOnPrimaryAndAwaitResponse() is waiting for the DBDirectClient command's response.
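
The three-way wait described above can be illustrated with a small toy model. This is only a sketch in Python, not the server's lock manager: ToyLock, find_cycle, simulate, and the owner names are hypothetical, and the model assumes just two modes (IX and X) and that a new request queues behind any conflicting request that is already pending, as the description states for the X request enqueued by stepdown.

```python
from dataclasses import dataclass, field

# Toy compatibility table: IX is compatible only with IX; X conflicts with everything.
COMPATIBLE = {("IX", "IX")}


def conflicts(a, b):
    return (a, b) not in COMPATIBLE


@dataclass
class ToyLock:
    granted: list = field(default_factory=list)  # [(owner, mode)]
    pending: list = field(default_factory=list)  # FIFO queue of (owner, mode)

    def request(self, owner, mode):
        """Grant unless a granted or already-queued request conflicts.

        Returns the owners this request is blocked behind (empty if granted).
        """
        blockers = [o for o, m in self.granted + self.pending if conflicts(mode, m)]
        if blockers:
            self.pending.append((owner, mode))
        else:
            self.granted.append((owner, mode))
        return blockers

    def release(self, owner):
        """Drop the owner's grants, then promote waiters in FIFO order."""
        self.granted = [(o, m) for o, m in self.granted if o != owner]
        while self.pending and not any(
            conflicts(self.pending[0][1], m) for _, m in self.granted
        ):
            self.granted.append(self.pending.pop(0))


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path, node = [start], start
    while node in waits_for:
        node = waits_for[node]
        if node in path:
            return path[path.index(node):] + [node]
        path.append(node)
    return None


def simulate(hold_rstl_during_command):
    rstl = ToyLock()
    waits_for = {}

    # The caller takes the RSTL in IX mode on its original opCtx.
    rstl.request("caller_opCtx", "IX")
    if not hold_rstl_during_command:
        # Behaviour asked for in the ticket title: no RSTL held while the command runs.
        rstl.release("caller_opCtx")

    # Stepdown asks for X, then the direct-client command asks for IX
    # on the alternative opCtx.
    for owner, mode in (("stepdown", "X"), ("direct_client_opCtx", "IX")):
        blockers = rstl.request(owner, mode)
        if blockers:
            waits_for[owner] = blockers[0]

    # The caller waits for the command's response (not a lock wait).
    waits_for["caller_opCtx"] = "direct_client_opCtx"
    return find_cycle(waits_for, "caller_opCtx")
```

In this model, simulate(True) reports the cycle caller_opCtx -> direct_client_opCtx -> stepdown -> caller_opCtx, matching the description. simulate(False) finds no cycle, because stepdown obtains X immediately and the command only waits for stepdown to finish.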



 Comments   
Comment by Githook User [ 11/Mar/20 ]

Author: Suganthi Mani (smani87) <suganthi.mani@mongodb.com>

Message: SERVER-46664 runCmdOnPrimaryAndAwaitResponse() should not run DBDirect client command with the rstl lock held.

(cherry picked from commit 997841bdace7ff9ed5fd5bd0f952ec20880b9d92)
Branch: v4.4
https://github.com/mongodb/mongo/commit/5b01fd594e0b75182c212f39c59d25d15d92bc0b

Comment by Githook User [ 09/Mar/20 ]

Author: Suganthi Mani (smani87) <suganthi.mani@mongodb.com>

Message: SERVER-46664 runCmdOnPrimaryAndAwaitResponse() should not run DBDirect client command with the rstl lock held.
Branch: master
https://github.com/mongodb/mongo/commit/997841bdace7ff9ed5fd5bd0f952ec20880b9d92
