[SERVER-75285] Deadlock between ShardsvrCheckMetadataConsistencyParticipantCommand, prepared transactions, and stepdown Created: 24/Mar/23  Updated: 27/Oct/23  Resolved: 27/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Samyukta Lanka Assignee: Tommaso Tocci
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-75288 Investigate whether the stepdown kill... Open
is related to SERVER-72895 Implement shardKey index check in che... Closed
is related to SERVER-74667 Use lock-free read approch for checkM... Closed
Assigned Teams:
Sharding EMEA
Operating System: ALL
Participants:
Linked BF Score: 135

 Description   

ShardsvrCheckMetadataConsistencyParticipantCommand currently takes a DB lock in IS mode without opting out of taking the RSTL. This means that it will not be killed on stepdown (since it didn't take the global lock in a mode that conflicts with writes).

(Edit: at the time that this deadlock was found, the command took the DB lock in S mode).

This can then cause a deadlock with prepared transactions if the transaction is holding the DB lock that checkMetadataConsistency is looking to acquire, but committing the transaction is blocked on a stepdown (as in the node isn't able to replicate the commitTransaction command until it completes stepping down).

The order of events is:
1. Prepare a transaction that holds the DB lock in IX mode for some db that checkMetadataConsistency might need to take a DB lock for.
2. ShardsvrCheckMetadataConsistencyParticipantCommand tries to take the DB lock for the db mentioned above and ends up holding the RSTL in IX mode while it waits.
3. The node tries to step down before it receives the commitTransaction command.

A targeted way to fix this would be to manually ensure that checkMetadataConsistency is killed by the stepdown thread or make sure it does not hold the RSTL.
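The order of events above can be modelled as a cycle in a wait-for graph. The following is a minimal Python sketch, not server code: the node names and the `find_cycle` helper are hypothetical, and it assumes that the stepdown thread is blocked because it needs the RSTL in a mode that conflicts with the IX hold taken by checkMetadataConsistency.

```python
def find_cycle(waits_for):
    """Return a list of nodes forming a cycle in the wait-for graph, or None."""
    visiting = []   # current depth-first path
    done = set()    # nodes already proven cycle-free

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the current path: that is a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for blocker in waits_for.get(node, ()):
            cycle = visit(blocker)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical labels for the three actors in the ticket.
DEADLOCK = {
    # Wants the DB lock that the prepared transaction holds in IX.
    "checkMetadataConsistency": ["prepared_txn"],
    # commitTransaction cannot make progress until the stepdown completes.
    "prepared_txn": ["stepdown"],
    # Assumption: stepdown waits for the RSTL held in IX by the command.
    "stepdown": ["checkMetadataConsistency"],
}

# Either proposed fix (kill the command on stepdown, or never hold the RSTL)
# removes the last edge.
AFTER_FIX = dict(DEADLOCK, stepdown=[])

print(find_cycle(DEADLOCK))
print(find_cycle(AFTER_FIX))
```

In this model both proposed fixes have the same effect: once the stepdown thread no longer waits on the command, the graph has no cycle and the stepdown can complete.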



 Comments   
Comment by Samyukta Lanka [ 27/Mar/23 ]

"from your explanation it seems like read operations must always be lock-free - am I understanding it correctly?"

I think an amendment based on Jordi's point is that reads that take DB locks in S mode should instead be lock-free, or else we do SERVER-75288.

Comment by Samyukta Lanka [ 27/Mar/23 ]

That's a great point. I think jordi.serra-torrens@mongodb.com is correct that this can't happen anymore because the IS lock won't conflict with prepared transactions.

Comment by Jordi Serra Torrens [ 27/Mar/23 ]

I'd like to point out that on BF-28038, ShardsvrCheckMetadataConsistencyParticipantCommand was trying to acquire the DB lock in MODE_S (rather than IS). The change from S to IS happened as part of SERVER-74667.

I think that's important, because I wouldn't expect ShardsvrCheckMetadataConsistencyParticipantCommand's MODE_IS acquisition to be blocked due to the prepared txn (MODE_IX). MODE_S however, would have blocked.
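The difference between the two modes can be illustrated with the standard multi-granularity lock compatibility matrix. This is a minimal Python sketch under that assumption; the `COMPATIBLE` table and the `conflicts` helper are hypothetical names for illustration, not the server's lock manager API.

```python
# Standard intent-lock compatibility: for each requested mode, the set of
# already-held modes it can coexist with.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}


def conflicts(requested, held):
    """Return True if a request in mode `requested` must wait behind `held`."""
    return held not in COMPATIBLE[requested]


# The prepared transaction holds the DB lock in IX.
print("MODE_S  blocked by IX:", conflicts("S", "IX"))
print("MODE_IS blocked by IX:", conflicts("IS", "IX"))
```

Under this matrix a MODE_S request queues behind the prepared transaction's IX hold, while a MODE_IS request is granted immediately, which matches the reasoning above.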

Comment by Kaloian Manassiev [ 27/Mar/23 ]

samy.lanka@mongodb.com, from your explanation it seems like read operations must always be lock-free - am I understanding it correctly?

My reading of it is that read operations shouldn't be holding the RSTL lock while waiting for IS locks further down the hierarchy. But that would only be possible if we had some snapshotting mechanism to ensure the read will access a consistent state, i.e. what is present in lock-free reads.

Do we have an example of a read operation which must run with locks held?

Generated at Thu Feb 08 06:29:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.