[SERVER-36589] Implement mongos abort logic Created: 10/Aug/18  Updated: 29/Oct/23  Resolved: 26/Sep/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.4

Type: Task
Priority: Major - P3
Reporter: Jack Mulrow
Assignee: Randolph Tan
Resolution: Fixed
Votes: 0
Labels: ShardedTxn:RouterSupport
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by DRIVERS-522 Support mongos pinning for sharded tr... Closed
Gantt Dependency
has to be done before SERVER-37210 Mongos should implicitly abort on err... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2018-08-27, Sharding 2018-09-10, Sharding 2018-09-24, Sharding 2018-10-08
Participants:

 Description   

Currently, mongos aborts transactions by sending abortTransaction to all shards, since this was sufficient for the single shard case. Instead, it should be changed to only target the shards in the participant list that have been successfully sent startTransaction=true.

Mongos should also remember if a txnId (lsid/txnNumber pair) is aborted, so it won't accept more statements for a transaction after it chooses to abort.
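
The two changes proposed above can be sketched roughly as follows. This is a minimal illustration, not the actual mongos code: the `RouterTxn` class, its `target` and `abort` methods, and the `send` callback are hypothetical names invented for the example. It sends abortTransaction to every shard in the participant list (the variant the comments below settle on), and the `aborted` guard models the "remember the abort" idea, which the comments defer to later work.

```python
from dataclasses import dataclass, field


@dataclass
class RouterTxn:
    """Hypothetical router-side state for one (lsid, txnNumber) pair."""
    lsid: str
    txn_number: int
    participants: list = field(default_factory=list)  # shards contacted so far
    aborted: bool = False

    def target(self, shard):
        """Return the transaction fields to attach to a statement sent to `shard`."""
        if self.aborted:
            # Assumed behavior: refuse further statements once the router aborted.
            raise RuntimeError("transaction was already aborted by the router")
        first_contact = shard not in self.participants
        if first_contact:
            self.participants.append(shard)
        fields = {"lsid": self.lsid, "txnNumber": self.txn_number, "autocommit": False}
        if first_contact:
            # Only the first statement sent to a shard starts the transaction there.
            fields["startTransaction"] = True
        return fields

    def abort(self, send):
        """Send abortTransaction to the participants only, not to every shard."""
        for shard in self.participants:
            send(shard, {"abortTransaction": 1, "lsid": self.lsid,
                         "txnNumber": self.txn_number, "autocommit": False})
        # The participant list is deliberately kept; only the flag changes.
        self.aborted = True
```

With this shape, a shard that was never contacted for the transaction receives no abortTransaction at all, which is the main behavior change the ticket asks for.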



 Comments   
Comment by Githook User [ 01/Oct/18 ]

Author: Kaloian Manassiev <kaloian.manassiev@mongodb.com> (kaloianm)

Message: SERVER-36589 Remove unused reference to s/transactions/SConscript
Branch: master
https://github.com/mongodb/mongo/commit/5090bdb9b2d87dc77954352d3124cbab4806b0a5

Comment by Githook User [ 26/Sep/18 ]

Author: Randolph Tan <randolph@10gen.com> (renctan)

Message: SERVER-36589 Implement mongos abort logic
Branch: master
https://github.com/mongodb/mongo/commit/071521f35b79c8e3c280cfade3b621d61243f3bc

Comment by Githook User [ 26/Sep/18 ]

Author: Randolph Tan <randolph@10gen.com> (renctan)

Message: SERVER-36589 Reorganize libraries and fold s/transaction to s/
Branch: master
https://github.com/mongodb/mongo/commit/625fa16dff719dbf6688af209c5f31913d1e794f

Comment by Jack Mulrow [ 05/Sep/18 ]

> I think it should send abort regardless of whether mongos has successfully started the transaction on the shard or not, since it cannot guarantee that the shard failed to execute when mongos gets an error.

Sounds good to me, as long as we stop sending abort to all shards in the cluster like we do now. That's all I really meant in the description.  

> since it will never succeed in running on the shards that aborted

You may not have seen this, but to allow internal retries by mongos, I had to make shards accept startTransaction=true at their active transaction number, even if the shard aborted that transaction number already (SERVER-36590). I think you're still correct though as long as we don't clear the participant list when we abort, because if we try to contact a shard we already aborted on, we won't send startTransaction=true and the shard will reject the statement. I guess there's no reason to clear the participant list on abort, so I agree it's not necessary for this ticket.
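
A toy model of the interaction described above, to make the argument concrete. Everything here is an assumption made for illustration: `ToyShard`, `route`, and `follow_up_after_abort` are hypothetical names, and the `"ok"` / `"rejected"` strings stand in for real command responses. The shard accepts startTransaction=true at its active transaction number even after aborting it, so whether a follow-up statement is rejected depends on whether the router kept its participant list.

```python
class ToyShard:
    """Toy shard tracking one active transaction number and whether it is open."""

    def __init__(self):
        self.active_txn_number = None
        self.open = False

    def abort(self, txn_number):
        if txn_number == self.active_txn_number:
            self.open = False

    def run_statement(self, txn_number, start_transaction=False):
        if start_transaction:
            # Assumed rule: a start at the active number is accepted even if
            # that transaction was aborted, so the router can retry internally.
            if self.active_txn_number is not None and txn_number < self.active_txn_number:
                return "rejected"
            self.active_txn_number = txn_number
            self.open = True
            return "ok"
        # Without startTransaction, the transaction must still be open.
        if txn_number == self.active_txn_number and self.open:
            return "ok"
        return "rejected"


def route(shard, name, participants, txn_number):
    """Router side: send startTransaction=true only on first contact with a shard."""
    first_contact = name not in participants
    if first_contact:
        participants.add(name)
    return shard.run_statement(txn_number, start_transaction=first_contact)


def follow_up_after_abort(clear_participants):
    """Run a statement, abort, then send one more statement to the same shard."""
    shard, participants = ToyShard(), set()
    route(shard, "shardA", participants, 7)
    shard.abort(7)
    if clear_participants:
        participants.clear()
    return route(shard, "shardA", participants, 7)
```

In this model, `follow_up_after_abort(False)` returns `"rejected"` because the router still treats shardA as a participant and omits startTransaction=true, while `follow_up_after_abort(True)` returns `"ok"` because the cleared list makes the router restart the transaction on the shard.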

 

Comment by Randolph Tan [ 05/Sep/18 ]

> Instead, it should be changed to only target the shards in the participant list that have been successfully sent startTransaction=true

I think it should send abort regardless of whether mongos has successfully started the transaction on the shard or not, since it cannot guarantee that the shard failed to execute when mongos gets an error.

> Mongos should also remember if a txnId (lsid/txnNumber pair) is aborted, so it won't accept more statements for a transaction after it chooses to abort.

I'm not fully convinced that this is 100% correct. On the other hand, I believe that it is safe to allow new statements to be executed, since it will never succeed in running on the shards that aborted, and the whole transaction will never be able to commit if at least one participant has already aborted. I'm going to defer this "nice to have" optimization to later work.

Generated at Thu Feb 08 04:43:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.