[SERVER-41126] Allow aborting a transaction on mongos even if the transaction has no participants Created: 13/May/19  Updated: 05/Jun/19  Resolved: 04/Jun/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Matthew Saltz (Inactive)
Resolution: Won't Fix Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Sprint: Sharding 2019-06-17
Participants:

 Description   

TransactionRouter::abortTransaction asserts that there is at least one participant, but we allow committing without any participants.

If we allow one, we should allow the other.



 Comments   
Comment by Matthew Saltz (Inactive) [ 04/Jun/19 ]

That makes sense. Closing as suggested

Comment by Esha Maharishi (Inactive) [ 04/Jun/19 ]

matthew.saltz, now I realize there is a bit of a difference between having no participants for commitTransaction and abortTransaction.

The only way commitTransaction should be received is if all earlier statements were successful, so the only way to have no participants at commit time is if one of the statements was a read on a non-existent database.

However, abortTransaction can be received even if an earlier statement failed before targeting any participants. In fact, sending abort after a failed statement is part of the documented API.

Given this, I think we should make abortTransaction behave as expected in the vast majority of cases rather than in the special non-existent database case and close this as Won't Fix.

Comment by Matthew Saltz (Inactive) [ 04/Jun/19 ]

Two jstests are responsible for almost all of the failures so shouldn't be too bad to fix actually: First failing test. Second failing test.

Both cases are trying abort after another unsuccessful command and expecting to see NoSuchTransaction since that transaction has already failed, which actually sort of makes sense as a behavior. And in fact, I think the aggregation_in_transaction test also runs on single replica sets, so that allows the behavior to be consistent between the two. I think the only way to fix that then would be to just remove that assertion (which may be fine). So let me know what you think.

Comment by Matthew Saltz (Inactive) [ 04/Jun/19 ]

I tried this out quickly and even after fixing the unit tests there are a number of jstests which fail. Based on the earlier comment then, it seems like we shouldn't bother fixing this? esha.maharishi

Comment by Kaloian Manassiev [ 03/Jun/19 ]

I think the difference between a shard and a router transaction is that the router has the state "I received some transaction commands, but didn't find anywhere to route them" whereas there is no situation where a shard received something with beginTransaction, but didn't actually do anything (even if that "anything" is to just update the participant state).

So to answer your question - yes it is kind of accurate with the caveat above.

Comment by Matthew Saltz (Inactive) [ 03/Jun/19 ]

I noticed this comment says that this is mimicking the behavior on the shard. Do we know if that's still accurate?

Comment by Gregory McKeon (Inactive) [ 14/May/19 ]

If any tests fail, don't fix this.

Generated at Thu Feb 08 04:56:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.