[SERVER-41976] Server should not attach TransientTransactionError label to prepared transaction commands (commit) when the command fails with LockTimeout. Created: 27/Jun/19  Updated: 27/Oct/23  Resolved: 22/Jul/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Backlog - Replication Team
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-41556 Must handle failure to reacquire lock... Closed
Assigned Teams:
Replication
Participants:

 Description   

Currently, when a commit command for a prepared transaction fails with LockTimeout while unstashing the lock resources, the server attaches a TransientTransactionError label to the error response. This means the driver will retry the whole transaction, which is undesired behavior.



 Comments   
Comment by Ratika Gandhi [ 22/Jul/19 ]

Not a bug.

Comment by Suganthi Mani [ 08/Jul/19 ]

When a commitTransaction cmd fails while committing a prepared transaction, the error response with the TransientTransactionError label attached is sent only to the transaction coordinator. The coordinator's current behavior is to retry the commitTransaction cmd indefinitely until it succeeds. So there is no way for the error response of a commitTransaction cmd for a cross-shard transaction to reach the drivers.

When a commitTransaction cmd fails while committing an unprepared transaction, the error response with the TransientTransactionError label attached will reach the drivers. It's safe for drivers to retry the transaction, because they retry with the next higher txn number. That either aborts the transaction left behind by the failed commitTransaction cmd, or else the transaction reaper aborts it.

Basically, this is a harmless ticket, so we can deprioritize it.
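
For illustration, the driver-side retry described above for unprepared transactions can be sketched roughly as follows. This is a simplified model, not actual driver or server code: `CommandError`, `run_with_transient_retry`, and `flaky_transaction` are hypothetical names, and the only behavior taken from this ticket is that a response labelled TransientTransactionError causes the whole transaction to be retried with the next higher txn number.

```python
class CommandError(Exception):
    """Hypothetical stand-in for a server error response carrying error labels."""

    def __init__(self, code_name, labels=()):
        super().__init__(code_name)
        self.code_name = code_name
        self.labels = frozenset(labels)


def run_with_transient_retry(run_transaction, first_txn_number=1, max_attempts=3):
    """Run the whole transaction, retrying with the next higher txn number
    whenever the failure carries the TransientTransactionError label."""
    txn_number = first_txn_number
    for attempt in range(max_attempts):
        try:
            return run_transaction(txn_number)
        except CommandError as err:
            last_attempt = attempt == max_attempts - 1
            if "TransientTransactionError" not in err.labels or last_attempt:
                raise
            # Assumption from the comment above: starting a transaction with a
            # higher txn number supersedes the previously failed attempt.
            txn_number += 1


# Simulated transaction body: the first attempt fails with LockTimeout and the
# transient label, the second attempt succeeds.
attempted = []


def flaky_transaction(txn_number):
    attempted.append(txn_number)
    if txn_number == 1:
        raise CommandError("LockTimeout", labels=["TransientTransactionError"])
    return "committed"


result = run_with_transient_retry(flaky_transaction)
```

In this sketch the retry is only safe because the earlier attempt is known to be aborted or abortable, which is the property that does not hold for a prepared transaction that failed with LockTimeout.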

Comment by Esha Maharishi (Inactive) [ 01/Jul/19 ]

Ah, I see. Yes, I agree LockTimeout should not be a transient error for commitTransaction.

Comment by Judah Schvimer [ 01/Jul/19 ]

This is about committing a prepared transaction, not preparing a transaction. A TransientTransactionError label means that the transaction definitively aborted. A LockTimeout does not mean that the transaction definitively aborted.

In fact it should not be possible for a commitTransaction command on a prepared transaction to return a TransientTransactionError label, and as part of this ticket we should add an assertion such that we fail loudly (at least in our tests) if that happens.
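
A rough sketch of the kind of test-side check suggested above, written in Python purely for illustration. The function name and its parameters are hypothetical and do not correspond to any existing server or test-harness API.

```python
TRANSIENT_LABEL = "TransientTransactionError"


def assert_no_transient_label_on_prepared_commit(was_prepared, error_labels):
    """Fail loudly if a commitTransaction response for a prepared transaction
    carries the TransientTransactionError label.

    was_prepared: whether the transaction had been prepared before the commit.
    error_labels: the errorLabels list from the (failed) command response.
    """
    if was_prepared and TRANSIENT_LABEL in error_labels:
        raise AssertionError(
            "commitTransaction on a prepared transaction returned "
            + TRANSIENT_LABEL
        )


# An unprepared commit may legitimately carry the label.
assert_no_transient_label_on_prepared_commit(False, [TRANSIENT_LABEL])
# A prepared commit that fails without the label is also acceptable.
assert_no_transient_label_on_prepared_commit(True, [])
```

The check only looks at the label, not the error code, since the point made above is that the label itself claims the transaction definitively aborted.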

Comment by Esha Maharishi (Inactive) [ 28/Jun/19 ]

Just curious, why is it undesired behavior?

I think today it actually doesn't matter what error labels prepareTransaction attaches - the coordinator will not propagate the prepare response's label back to the client.
