[SERVER-62379] Fix deadlock between ReplicationCoordinator and BackgroundSync on stepUp Created: 05/Jan/22  Updated: 29/Oct/23  Resolved: 12/Jan/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.3.0, 5.0.7

Type: Bug Priority: Major - P3
Reporter: Moustafa Maher Assignee: Moustafa Maher
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-50869 Background sync may erroneously set a... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Requested:
v5.2, v5.1, v5.0, v4.4, v4.2
Sprint: Replication 2022-01-24
Participants:
Linked BF Score: 123

 Description   

Bug:
There's a deadlock in mongod between:

 

Proposed fix:

We need to move the call to _replCoord->getMyLastAppliedOpTime() so that it happens before we acquire the mutex.
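
As a rough illustration of the proposed fix, here is a minimal sketch of the lock-ordering pattern. It is not the actual mongod code. It assumes the deadlock is a classic lock-order inversion: one thread holds the coordinator's mutex and waits for the BackgroundSync mutex, while another holds the BackgroundSync mutex and waits inside getMyLastAppliedOpTime(). ToyReplCoord, ToyBackgroundSync, withLockHeld, onStepUp, startFetcher and runStepUpStress are hypothetical names; only getMyLastAppliedOpTime comes from this ticket.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for ReplicationCoordinator: state guarded by its own mutex.
class ToyReplCoord {
public:
    std::int64_t getMyLastAppliedOpTime() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _lastApplied;
    }

    // Runs fn while holding the coordinator mutex, mimicking a step-up path
    // that calls into another component with the lock held.
    template <typename Fn>
    void withLockHeld(Fn fn) {
        std::lock_guard<std::mutex> lk(_mutex);
        ++_lastApplied;
        fn();
    }

private:
    mutable std::mutex _mutex;
    std::int64_t _lastApplied = 0;
};

// Hypothetical stand-in for BackgroundSync.
class ToyBackgroundSync {
public:
    explicit ToyBackgroundSync(ToyReplCoord* replCoord) : _replCoord(replCoord) {}

    // Deadlock-prone ordering (described only, not implemented here):
    //   lock _mutex first, then call _replCoord->getMyLastAppliedOpTime().
    // If another thread holds the coordinator mutex and is waiting for _mutex,
    // neither thread can make progress.

    // Fixed ordering: read from the coordinator first, then lock _mutex.
    void startFetcher() {
        const std::int64_t lastApplied = _replCoord->getMyLastAppliedOpTime();
        std::lock_guard<std::mutex> lk(_mutex);
        _fetchFrom = lastApplied;
        ++_fetchersStarted;
    }

    // Called by the coordinator while it holds its own mutex.
    void onStepUp() {
        std::lock_guard<std::mutex> lk(_mutex);
        ++_stepUps;
    }

    int fetchersStarted() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _fetchersStarted;
    }

private:
    ToyReplCoord* _replCoord;
    mutable std::mutex _mutex;
    std::int64_t _fetchFrom = 0;
    int _fetchersStarted = 0;
    int _stepUps = 0;
};

// Runs both paths concurrently and returns how many fetcher starts completed.
// With the fixed ordering, no thread ever holds the BackgroundSync mutex while
// waiting for the coordinator mutex, so both threads always finish.
int runStepUpStress(int iterations) {
    ToyReplCoord coord;
    ToyBackgroundSync bgSync(&coord);

    std::thread stepUp([&] {
        for (int i = 0; i < iterations; ++i) {
            coord.withLockHeld([&] { bgSync.onStepUp(); });
        }
    });
    std::thread fetcher([&] {
        for (int i = 0; i < iterations; ++i) {
            bgSync.startFetcher();
        }
    });

    stepUp.join();
    fetcher.join();
    return bgSync.fetchersStarted();
}
```

One consequence of this ordering is that the value read before taking the mutex may be slightly stale by the time the mutex is held. Whether that is acceptable depends on how the real code uses it, which this sketch does not model.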



 Comments   
Comment by Githook User [ 19/Feb/22 ]

Author:

{'name': 'Moustafa Maher Khalil', 'email': 'm.maher@mongodb.com', 'username': 'moustafamaher'}

Message: SERVER-62379 Fix deadlock between ReplicationCoordinator and BackgroundSync on stepUp
Branch: v5.0
https://github.com/mongodb/mongo/commit/45d84929f04b99b882195c3b6c0333c91438e108

Comment by Moustafa Maher [ 02/Feb/22 ]

Backport justification:
----------------------

The bug is not rare enough to justify skipping the backport: it seems it could happen whenever a node is stepping up and starting a new oplog fetcher at the same time.
The change itself is minimally invasive. We are not changing any old behavior; we are only fixing the way SERVER-50869 works.

Comment by Moustafa Maher [ 21/Jan/22 ]

This needs to be backported to all versions containing SERVER-50869.

Comment by Githook User [ 11/Jan/22 ]

Author:

{'name': 'Moustafa Maher Khalil', 'email': 'm.maher@mongodb.com', 'username': 'moustafamaher'}

Message: SERVER-62379 Fix deadlock between ReplicationCoordinator and BackgroundSync on stepUp
Branch: master
https://github.com/mongodb/mongo/commit/45188cb0dbf11e9559af07a910bedb8786372bc3
