[SERVER-37348] TransactionReaper and periodic transaction abort thread shouldn't abort transactions on secondaries Created: 27/Sep/18 Updated: 29/Oct/23 Resolved: 25/Feb/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.9 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | Matthew Saltz (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | prepare_errors |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11 |
| Participants: |
| Description |
|
The periodic transaction killer kills unprepared transactions that run longer than 60 seconds. It shouldn't kill transactions on secondaries. Currently, a transaction is very unlikely to stay in the "kInProgress" state for longer than 60 seconds on a secondary, since such transactions become prepared right after applying all their write operations; this will become more likely once we start to support transactions consisting of multiple oplog entries. The session reaper and other periodic threads may have the same issue and need auditing. |
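The rule requested here can be sketched as a small selection function. This is only an illustrative model in Python, not the server's code: `Txn`, `TxnState`, `LIFETIME_LIMIT_SECS`, and `select_txns_to_abort` are hypothetical names, and the state enum only loosely mirrors "kInProgress".

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, List

# Hypothetical stand-in for the 60-second limit mentioned in the description.
LIFETIME_LIMIT_SECS = 60.0


class TxnState(Enum):
    IN_PROGRESS = auto()  # unprepared; loosely mirrors "kInProgress"
    PREPARED = auto()


@dataclass
class Txn:
    txn_id: int
    state: TxnState
    started_at: float  # seconds on some monotonic clock


def select_txns_to_abort(
    txns: Iterable[Txn], now: float, is_primary: bool
) -> List[int]:
    """Return the ids of transactions the periodic killer may abort."""
    if not is_primary:
        # On a secondary the killer aborts nothing, whatever the age.
        return []
    return [
        t.txn_id
        for t in txns
        if t.state is TxnState.IN_PROGRESS
        and now - t.started_at > LIFETIME_LIMIT_SECS
    ]
```

With this shape, the secondary case is handled by a single early return, so a long-running in-progress transaction on a secondary is never selected.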
| Comments |
| Comment by Githook User [ 25/Feb/19 ] |
|
Author: {'name': 'Matthew Saltz', 'username': 'saltzm', 'email': 'matthew.saltz@mongodb.com'}Message: |
| Comment by Matthew Saltz (Inactive) [ 17/Jan/19 ] |
|
Things to do for this ticket (per discussion): |
| Comment by Judah Schvimer [ 17/Jan/19 ] |
|
Per discussion, this can be fixed by making secondary oplog application batches uninterruptible. |
| Comment by Siyuan Zhou [ 16/Jan/19 ] |
|
That's an interesting idea. If we mark transactions in secondary mode, we need to do that whenever we enter secondary mode and unmark them on stepup. Instead, we can enable the transaction reaper on stepup and disable it on stepdown, as we did for the migration manager, assuming the lifecycle of a transaction is entirely managed by the primary (which I believe is true). The transactions for dbhash that are allowed on secondaries may make things different. Probably we can just leave them alone, since they are not allowed in production? For the transaction reaper, another option is to let it run all the time, but check whether we are master before killing anything under the RSTL lock. However, the lifecycle of a session isn't very clear to me. My impression is that it is orthogonal to transactions and can be used for other purposes on both primary and secondary, but it actually cleans up the session / transaction participant and writes into the transaction table, which affects the lifecycle of a transaction. I guess the sharding team can shed some light on that. |
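The two reaper options in this comment can be contrasted with a toy model. This is a rough Python sketch under stated assumptions, not the server's implementation: `ReplState`, `ToggledReaper`, and `AlwaysOnReaper` are hypothetical names, and a plain `threading.Lock` stands in for the RSTL.

```python
import threading
from typing import Callable, Iterable, List


class ReplState:
    """Toy replication state; `lock` is a stand-in for the RSTL."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.is_primary = False

    def step_up(self) -> None:
        with self.lock:
            self.is_primary = True

    def step_down(self) -> None:
        with self.lock:
            self.is_primary = False


class ToggledReaper:
    """Option 1: the reaper is enabled on stepup and disabled on stepdown."""

    def __init__(self, kill: Callable[[int], None]) -> None:
        self._kill = kill
        self._enabled = False

    def on_step_up(self) -> None:
        self._enabled = True

    def on_step_down(self) -> None:
        self._enabled = False

    def run_once(self, expired: Iterable[int]) -> List[int]:
        if not self._enabled:
            return []
        killed = []
        for txn_id in expired:
            self._kill(txn_id)
            killed.append(txn_id)
        return killed


class AlwaysOnReaper:
    """Option 2: the reaper always runs but checks the role under the lock."""

    def __init__(self, repl: ReplState, kill: Callable[[int], None]) -> None:
        self._repl = repl
        self._kill = kill

    def run_once(self, expired: Iterable[int]) -> List[int]:
        killed = []
        # Holding the lock keeps the role from changing between the
        # check and the kills in this toy model.
        with self._repl.lock:
            if not self._repl.is_primary:
                return killed
            for txn_id in expired:
                self._kill(txn_id)
                killed.append(txn_id)
        return killed
```

In this sketch, option 1 relies on the stepup/stepdown hooks firing reliably, while option 2 re-checks the role on every pass at the cost of taking the lock each time.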
| Comment by Matthew Saltz (Inactive) [ 15/Jan/19 ] |
|
Generally speaking, is it true that we never want to be able to kill a transaction that's on a secondary? I'm wondering if it would make sense, when we start a transaction on the secondary here, to tag the transaction as unkillable due to being on a secondary (or something like that). |