[SERVER-57602] Deadlock between signalDrainComplete and operations acquiring the FCV lock Created: 10/Jun/21  Updated: 29/Oct/23  Resolved: 15/Jun/21

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 5.0.0-rc2, 5.1.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Lingzhi Deng
Assignee: Jason Chan
Resolution: Fixed
Votes: 0
Labels: post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-46379 Implement upgrade/downgrade support f... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Sprint: Repl 2021-06-28
Participants:
Linked BF Score: 144

 Description   

ReplicationCoordinatorImpl::signalDrainComplete takes the RSTL lock in X mode before acquiring the FCV lock for reconfig. Based on SERVER-33043, the FCV lock should be acquired before the global lock. I think this implies that the FCV lock should also be acquired before the RSTL lock, because other operations take the FCV lock first and only then acquire the global / RSTL locks, so the two acquisition orders can deadlock against each other. FixedFCVRegion has an invariant that checks that the global lock is not already held. I think we should extend the invariant to check the RSTL lock as well.
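
To illustrate the ordering rule described above, here is a minimal, self-contained sketch of an acquisition-order check. It is not MongoDB code: LockRank, heldLocks, OrderedLockGuard and orderIsAllowed are hypothetical names, the guard only records ordering and does no real locking, and the relative rank of RSTL versus the global lock is an assumption made for the example. The only ordering taken from the ticket is that the FCV lock comes before both.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <iterator>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical ranks: a lock with a lower rank must be acquired first.
// FCV before RSTL/Global follows the ticket; RSTL < Global is assumed.
enum class LockRank { FCV = 0, RSTL = 1, Global = 2 };

// Locks currently held by this thread (illustrative bookkeeping only).
inline std::vector<LockRank>& heldLocks() {
    thread_local std::vector<LockRank> held;
    return held;
}

// RAII guard that rejects an acquisition if a higher-ranked lock is
// already held, similar in spirit to an invariant in a lock constructor.
class OrderedLockGuard {
public:
    explicit OrderedLockGuard(LockRank rank) : _rank(rank) {
        auto& held = heldLocks();
        const bool violates = std::any_of(
            held.begin(), held.end(),
            [rank](LockRank h) { return h > rank; });
        if (violates) {
            throw std::logic_error("lock order violation");
        }
        held.push_back(rank);
    }

    ~OrderedLockGuard() {
        auto& held = heldLocks();
        // Remove the most recent entry for this rank.
        auto it = std::find(held.rbegin(), held.rend(), _rank);
        assert(it != held.rend());
        held.erase(std::next(it).base());
    }

    OrderedLockGuard(const OrderedLockGuard&) = delete;
    OrderedLockGuard& operator=(const OrderedLockGuard&) = delete;

private:
    LockRank _rank;
};

// Returns true if taking the locks in the given order passes the check.
// All guards are released again before the function returns.
bool orderIsAllowed(std::initializer_list<LockRank> order) {
    std::vector<std::unique_ptr<OrderedLockGuard>> guards;
    try {
        for (LockRank r : order) {
            guards.push_back(std::make_unique<OrderedLockGuard>(r));
        }
    } catch (const std::logic_error&) {
        return false;
    }
    return true;
}
```

Under these assumptions, orderIsAllowed({LockRank::FCV, LockRank::RSTL, LockRank::Global}) passes, while orderIsAllowed({LockRank::RSTL, LockRank::FCV}) is rejected; the latter is the order this ticket attributes to signalDrainComplete.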



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 16/Jun/21 ]

Author: Jason Chan <jason.chan@mongodb.com> (jasonjhchan)

Message: SERVER-57602 Don't acquire FCV lock on reconfig triggered by signalDrainComplete()

(cherry picked from commit 4d53dbd076f5d63197c88fe9509038a4b3c90055)
Branch: v5.0
https://github.com/mongodb/mongo/commit/ace1746cc2640d5c862d9c7a4db7acd0c25eb75a

Comment by Githook User [ 16/Jun/21 ]

Author: Jason Chan <jason.chan@mongodb.com> (jasonjhchan)

Message: SERVER-57602 Don't acquire FCV lock on reconfig triggered by signalDrainComplete()
Branch: SERVER-34632
https://github.com/mongodb/mongo/commit/4d53dbd076f5d63197c88fe9509038a4b3c90055

Comment by Githook User [ 15/Jun/21 ]

Author: Jason Chan <jason.chan@mongodb.com> (jasonjhchan)

Message: SERVER-57602 Don't acquire FCV lock on reconfig triggered by signalDrainComplete()
Branch: master
https://github.com/mongodb/mongo/commit/4d53dbd076f5d63197c88fe9509038a4b3c90055
