[SERVER-37199] Yield locks of transactions in secondary application Created: 19/Sep/18  Updated: 29/Oct/23  Resolved: 03/Dec/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.1.6

Type: Task Priority: Major - P3
Reporter: Siyuan Zhou Assignee: Siyuan Zhou
Resolution: Fixed Votes: 0
Labels: prepare_durability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-38282 Reacquire locks for transactions on s... Closed
Duplicate
is duplicated by SERVER-38121 multikey index ops in a transaction c... Closed
Problem/Incident
causes SERVER-38588 Hybrid index builds do not work when ... Closed
Related
related to SERVER-37336 Test that background index build do n... Closed
related to SERVER-39372 Make secondary lock acquisition for D... Closed
related to SERVER-40723 Deadlock between S lock acquisition o... Closed
related to SERVER-39424 Test that DDL operations can't succee... Closed
related to SERVER-37988 recover locks on step up at the begin... Closed
is related to SERVER-37313 FTDC collection blocked during foregr... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2018-10-08, Repl 2018-10-22, Repl 2018-11-05, Repl 2018-11-19, Repl 2018-12-03, Repl 2018-12-17
Participants:
Linked BF Score: 62

 Description   

Secondary oplog application tends to acquire locks conservatively; for example, all commands acquire the global write lock. This will conflict with prepared transactions. We can yield the locks of transactions on secondaries, since the oplog should include no conflicting operations due to the concurrency control on the primary.

An alternative solution is to have secondaries acquire the same locks as the primary, but yielding locks will also fix other issues, e.g. SERVER-38121.
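
For illustration only, here is a minimal standalone C++ sketch of the yield-then-restore idea described above. None of the names (SimpleLockManager, TxnLockSnapshot, yieldAll, restore) come from the MongoDB source, and real conflict checking is elided; it only shows that a prepared transaction's locks can be remembered, released, and later reacquired.

    // Minimal standalone sketch (not MongoDB source); all type and method names
    // here are hypothetical and only illustrate the yield-then-restore idea.
    #include <map>
    #include <mutex>
    #include <string>
    #include <utility>
    #include <vector>

    enum class LockMode { kIntentShared, kIntentExclusive, kShared, kExclusive };

    // Snapshot of the resources a prepared transaction held before yielding.
    struct TxnLockSnapshot {
        std::vector<std::pair<std::string, LockMode>> resources;
    };

    class SimpleLockManager {
    public:
        void lock(const std::string& resource, LockMode mode) {
            std::lock_guard<std::mutex> lk(_mutex);
            _held[resource] = mode;  // Conflict checking elided for brevity.
        }

        // Yield: release the locks but remember them so they can be reacquired.
        TxnLockSnapshot yieldAll() {
            std::lock_guard<std::mutex> lk(_mutex);
            TxnLockSnapshot snap;
            for (auto& [res, mode] : _held)
                snap.resources.emplace_back(res, mode);
            _held.clear();
            return snap;
        }

        // Restore: reacquire exactly the locks recorded in the snapshot. On a
        // secondary this should not conflict, because the primary's concurrency
        // control already serialized conflicting operations in the oplog.
        void restore(const TxnLockSnapshot& snap) {
            std::lock_guard<std::mutex> lk(_mutex);
            for (auto& [res, mode] : snap.resources)
                _held[res] = mode;
        }

    private:
        std::mutex _mutex;
        std::map<std::string, LockMode> _held;
    };

    int main() {
        SimpleLockManager locker;
        // A prepared transaction holds IX locks on its database and collection.
        locker.lock("db.test", LockMode::kIntentExclusive);
        locker.lock("db.test.coll", LockMode::kIntentExclusive);

        // On a secondary, yield the locks after prepare so other oplog
        // application (and commands) do not block behind the transaction...
        TxnLockSnapshot snap = locker.yieldAll();

        // ...and restore them when the commitTransaction / abortTransaction
        // oplog entry is applied (or on step-up).
        locker.restore(snap);
        return 0;
    }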



 Comments   
Comment by Githook User [ 03/Dec/18 ]

Author: Siyuan Zhou <siyuan.zhou@mongodb.com> (GitHub: visualzhou)

Message: SERVER-37199 Yield locks of transactions in secondary application.
Branch: master
https://github.com/mongodb/mongo/commit/55e72b015e2aa7297c00db29e4d93451ea61a7ca

Comment by Judah Schvimer [ 15/Nov/18 ]

I would rather yield locks and reacquire them. It makes me less nervous about accidentally allowing in readers without generating prepare conflicts. That said, I agree solution (1) would work. I don't follow how parallel application would be made more difficult by yielding locks though.

Comment by Siyuan Zhou [ 15/Nov/18 ]

judah.schvimer, I see the point of recovering locks for step-up. There are two solutions: 1) drop the locks of prepared transactions on the secondary, then abort and reapply them on step-up; 2) yield the locks of prepared transactions on the secondary, then resume them on step-up. The first is easier on secondary application but harder on step-up; the second requires more work during secondary application but less on step-up. Once we have transactions larger than 16MB, solution #2 will need to yield and restore locks for all transactional operations, while solution #1 keeps things simpler by dropping the locks.

milkie and I thought solution #1 was easier before considering the state transition. Now I'm leaning towards solution #2 due to the performance concerns of solution #1 on step-up and the complexity of its reapplication.

However, if secondary application is going to diverge further from the primary's behavior when we apply transactions in parallel (e.g. they stash ops in different ways), it might be more straightforward to re-apply the ops rather than recover them on state transitions.
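
As a rough illustration of the trade-off between the two solutions discussed in this comment, the sketch below contrasts the two step-up paths. All names are invented for illustration and the bodies just print what each path would do; nothing here is taken from the server code.

    // Illustrative standalone C++ contrasting the two step-up paths;
    // none of these names come from the MongoDB source.
    #include <iostream>
    #include <string>
    #include <vector>

    struct PreparedTxn {
        std::string txnId;
        std::vector<std::string> oplogOps;      // ops to reapply (solution #1)
        std::vector<std::string> yieldedLocks;  // locks to resume (solution #2)
    };

    // Solution #1: locks were dropped on the secondary, so on step-up the new
    // primary must abort the in-memory state and reapply the prepare ops,
    // which grows with transaction size (a concern for >16MB transactions).
    void stepUpWithReapply(const std::vector<PreparedTxn>& txns) {
        for (const auto& txn : txns) {
            std::cout << "abort in-memory state of " << txn.txnId << "\n";
            for (const auto& op : txn.oplogOps)
                std::cout << "  reapply " << op << "\n";
        }
    }

    // Solution #2: locks were yielded (remembered) on the secondary, so step-up
    // only has to resume them, independent of how many ops the transaction has.
    void stepUpWithResume(const std::vector<PreparedTxn>& txns) {
        for (const auto& txn : txns)
            for (const auto& lock : txn.yieldedLocks)
                std::cout << "resume " << lock << " for " << txn.txnId << "\n";
    }

    int main() {
        std::vector<PreparedTxn> txns = {
            {"txn-1", {"insert a", "update b"}, {"IX db.test", "IX db.test.coll"}}};
        stepUpWithReapply(txns);
        stepUpWithResume(txns);
        return 0;
    }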

Comment by Judah Schvimer [ 14/Nov/18 ]

Prepared transactions will need to recover their locks on step-up while the RSTL is held. So that step-up writes (mainly dropping temporary collections) do not conflict with prepared transactions, we should recover the prepared transaction locks at the very end of step-up, right before releasing the RSTL.
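
A minimal sketch of the ordering proposed above, with every function name invented for illustration; it only shows that step-up writes run before the prepared transactions reacquire their locks, all while the RSTL is held.

    // Standalone illustration of the proposed step-up ordering; all names
    // are hypothetical and do not correspond to MongoDB internals.
    #include <iostream>

    void acquireRSTL()               { std::cout << "acquire RSTL\n"; }
    void releaseRSTL()               { std::cout << "release RSTL\n"; }
    void dropTemporaryCollections()  { std::cout << "drop temp collections\n"; }
    void writeNewTermOplogEntry()    { std::cout << "write new-term oplog entry\n"; }
    void reacquirePreparedTxnLocks() {
        std::cout << "reacquire locks of prepared transactions\n";
    }

    int main() {
        acquireRSTL();
        // Step-up writes run first so they cannot conflict with the prepared
        // transactions' locks...
        dropTemporaryCollections();
        writeNewTermOplogEntry();
        // ...and the prepared transactions reacquire their locks at the very
        // end, right before the RSTL is released.
        reacquirePreparedTxnLocks();
        releaseRSTL();
        return 0;
    }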

Comment by Geert Bosch [ 06/Nov/18 ]

siyuan.zhou, I confirmed for listCollections. We still have a few commands that take MODE_S locks, such as dbStats and dbHash, as Tess mentions. We're planning to get rid of these as well.

Comment by Tess Avitabile (Inactive) [ 06/Nov/18 ]

We still use S mode for dbhash here. However, I don't think it's important that prepared transactions conflict with dbhash on secondaries.

Comment by Siyuan Zhou [ 06/Nov/18 ]

Discussed an alternative solution with geert.bosch: we may yield locks after transactions get prepared on secondaries, so that other commands don't conflict with them, because the concurrency control on the primary guarantees there's no conflict as long as the ops are applied in oplog order. One concern from Judah was the behavioral change for operations that hold locks in S mode on secondaries, which conflict with prepared transactions' IX locks. According to geert.bosch, we've removed all S mode locks on master; for example, listCollections now takes the DB lock in IS mode and then each collection lock in IS mode, rather than a DB lock in S mode. However, to support the yielding behavior we probably need to change WriteUnitOfWork to introduce a new prepared state for its recovery unit: WriteUnitOfWork is an RAII type designed to represent two-phase locking, and yielding violates those semantics.

That being said, we'll keep investigating the original solution. After all, it makes the system easier to reason about if operations on secondaries acquire the same locks as on the primary.
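
As a rough illustration of the "prepared state" idea mentioned above, the sketch below adds such a state to a simplified RAII unit of work. This is not the real WriteUnitOfWork API, just a standalone example of how a prepared unit of work could outlive the usual commit-or-abort-on-scope-exit semantics.

    // Standalone sketch of a "prepared state" for a RAII unit of work;
    // UnitOfWorkSketch and its methods are invented for illustration only.
    #include <cassert>
    #include <iostream>

    class UnitOfWorkSketch {
    public:
        enum class State { kActive, kPrepared, kCommitted, kAborted };

        UnitOfWorkSketch() { std::cout << "begin unit of work\n"; }

        ~UnitOfWorkSketch() {
            // Plain RAII two-phase locking: anything still active at scope
            // exit is aborted. A prepared unit of work is deliberately left
            // alone here, since its outcome is decided later by the
            // commitTransaction/abortTransaction oplog entry.
            if (_state == State::kActive)
                abort();
        }

        void prepare() {
            assert(_state == State::kActive);
            _state = State::kPrepared;
            std::cout << "prepare: changes pinned, locks may now be yielded\n";
        }

        void commit() {
            assert(_state == State::kActive || _state == State::kPrepared);
            _state = State::kCommitted;
            std::cout << "commit\n";
        }

        void abort() {
            _state = State::kAborted;
            std::cout << "abort\n";
        }

    private:
        State _state = State::kActive;
    };

    int main() {
        UnitOfWorkSketch wuow;
        wuow.prepare();  // new prepared state; locks could be yielded here
        wuow.commit();   // later, when the commit oplog entry is applied
        return 0;
    }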

Comment by Siyuan Zhou [ 19/Sep/18 ]

We should also audit the applyOps command's oplog entry path to make sure there is no way to generate oplog entries that take the Global lock during oplog application.
