[SERVER-84468] Fix deadlock when running runTransactionOnShardingCatalog() Created: 02/Jan/24  Updated: 03/Jan/24  Resolved: 03/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.0, 7.3.0-rc0, 7.2.0
Fix Version/s: 7.2.1, 7.3.0-rc0, 7.0.6

Type: Bug
Priority: Major - P3
Reporter: Silvia Surroca
Assignee: Silvia Surroca
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
is caused by SERVER-76749 runTransactionOnShardingCatalog shoul... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v7.2, v7.0
Sprint: CAR Team 2024-01-08
Participants:
Linked BF Score: 26

 Description   

The function runTransactionOnShardingCatalog() is called when a config server operation needs to execute an internal transaction.

This function creates a new OperationContext under an AlternativeClientRegion. The new OperationContext is only marked as interruptible a few lines after its creation.
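To make the window between creation and the interruptible flag concrete, here is a toy model in C++. It is a sketch only: ToyOpCtx, markInterruptible(), tryKill() and the two helper functions are hypothetical names invented for illustration, not MongoDB's OperationContext API, and the model assumes that a kill request arriving before the flag is set is simply dropped.

```cpp
#include <atomic>
#include <cassert>

// Toy stand-in for an operation context. NOT MongoDB's OperationContext;
// every name here is hypothetical and exists only for illustration.
class ToyOpCtx {
public:
    // Mark the operation as one that a step-down is allowed to interrupt.
    void markInterruptible() { _interruptible.store(true); }

    // Called by the (toy) step-down thread. Assumption of this model:
    // a kill request that arrives before markInterruptible() is dropped.
    bool tryKill() {
        if (!_interruptible.load()) {
            return false;  // does not meet the conditions to be killed
        }
        _killed.store(true);
        return true;
    }

    bool isKilled() const { return _killed.load(); }

private:
    std::atomic<bool> _interruptible{false};
    std::atomic<bool> _killed{false};
};

// Racy ordering: the kill request lands inside the window between
// creation and markInterruptible(), so it is lost.
bool killLandsWhenMarkedLate() {
    ToyOpCtx op;
    op.tryKill();            // step-down arrives first
    op.markInterruptible();  // flag is set too late
    return op.isKilled();
}

// Safe ordering in this toy: the flag is set before any kill request
// can be observed, so the kill takes effect.
bool killLandsWhenMarkedFirst() {
    ToyOpCtx op;
    op.markInterruptible();
    op.tryKill();
    return op.isKilled();
}
```

In this model killLandsWhenMarkedLate() returns false and killLandsWhenMarkedFirst() returns true. The ticket does not describe how the actual fix closes the window, so the second ordering should be read as one possible approach rather than the change that was committed.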

If the step-down thread tries to kill the OperationContext right before it is set as interruptible, the kill has no effect and the server can end up in a deadlock. This is the event sequence for the deadlock:

 

1. Under runTransactionOnShardingCatalog(), a new OperationContext is created.
2. The step-down thread kicks in and tries to kill the newly created OperationContext, but the kill is skipped because the OperationContext does not yet meet the conditions to be killed (it is not yet marked as interruptible).
3. The new OperationContext is set as interruptible, but too late for the kill attempt.
4. The internal transaction checks out a session.
5. The step-down thread acquires the RSTL.
6. The step-down thread tries to check out all active sessions to kill them, and blocks because one session is still checked out by the thread that was never interrupted.
7. The internal transaction tries to acquire the RSTL and blocks, because the step-down thread holds it.
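The end state of this sequence is a circular wait: the step-down thread holds the RSTL and waits for the session, while the internal transaction holds the session and waits for the RSTL. The sketch below expresses that as a wait-for graph and checks it for a cycle. The graph type, the thread labels and the function names are assumptions made for illustration; nothing here is taken from the server code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// waitsFor[a] == b means "thread a is blocked until thread b releases
// a resource". Each thread waits on at most one other thread here.
using WaitForGraph = std::map<std::string, std::string>;

// Follow the wait-for edges from every thread; revisiting a thread on
// the same walk means the graph has a cycle, i.e. a deadlock.
bool hasDeadlock(const WaitForGraph& g) {
    for (const auto& entry : g) {
        std::set<std::string> seen;
        std::string cur = entry.first;
        while (g.count(cur) != 0) {
            if (!seen.insert(cur).second) {
                return true;  // came back to a thread already on this path
            }
            cur = g.at(cur);
        }
    }
    return false;
}

// State described in the ticket once both threads are stuck.
WaitForGraph ticketScenario() {
    return {
        // Holds the RSTL, waits for the checked-out session.
        {"stepdown", "internalTxn"},
        // Holds the session, waits for the RSTL.
        {"internalTxn", "stepdown"},
    };
}

// Hypothetical state if the kill had taken effect: the transaction is
// interrupted and does not go on to wait for the RSTL.
WaitForGraph interruptedScenario() {
    return {
        {"stepdown", "internalTxn"},
    };
}
```

Here hasDeadlock(ticketScenario()) returns true and hasDeadlock(interruptedScenario()) returns false. The second graph assumes that an interrupted transaction eventually releases its session, so the step-down thread's wait is temporary rather than part of a cycle.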

 



 Comments   
Comment by Githook User [ 03/Jan/24 ]

Author:

{'name': 'Silvia Surroca', 'email': 'silvia.surroca@mongodb.com', 'username': 'silviasuhu'}

Message: SERVER-84468 Fix deadlock when running runTransactionOnShardingCatalog() (#17821)

(cherry picked from commit 30410c6458fc965ae1277e78a2313e59a4e5d24f)

GitOrigin-RevId: c5231702cf05da78b094eb351d9e2dff95538113
Branch: v7.0
https://github.com/mongodb/mongo/commit/43d9974fe17b936beadf8179f8dc65e0fbbcba8e

Comment by Githook User [ 03/Jan/24 ]

Author:

{'name': 'Silvia Surroca', 'email': 'silvia.surroca@mongodb.com', 'username': 'silviasuhu'}

Message: SERVER-84468 Fix deadlock when running runTransactionOnShardingCatalog() (#17821)

(cherry picked from commit 30410c6458fc965ae1277e78a2313e59a4e5d24f)
Branch: v7.2
https://github.com/mongodb/mongo/commit/40808dc197f45300468e3e0f8802b3740516b0fd

Comment by Githook User [ 02/Jan/24 ]

Author:

{'name': 'Silvia Surroca', 'email': 'silvia.surroca@mongodb.com', 'username': 'silviasuhu'}

Message: SERVER-84468 Fix deadlock when running runTransactionOnShardingCatalog() (#17821)

GitOrigin-RevId: 30410c6458fc965ae1277e78a2313e59a4e5d24f
Branch: master
https://github.com/mongodb/mongo/commit/5af259b4e6dde01213b6108c2933397855634290

Generated at Thu Feb 08 06:55:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.