[SERVER-39228] Mongos can track orphaned participant shards to avoid sending abort between retries of transaction statements
Created: 28/Jan/19  Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Jack Mulrow
Assignee: Backlog - Cluster Scalability
Resolution: Unresolved
Votes: 0
Labels: ShardedTxn:FutureOptimizations, pm-564, sharding-common-backlog, sharding-nyc-subteam3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Cluster Scalability
Participants:
Story Points: 5

Description

When mongos encounters a "retryable" error in a transaction (e.g. a snapshot error on the first client statement), it removes the newly added participants from the participant list and sends abortTransaction to each before retrying the failed statement. This guarantees that no transaction is left open on a shard that the retry does not target. To prevent the retry from racing with the aborts, the router must wait for a response to abortTransaction from each cleared participant before it sends the retry.
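
The current behavior can be sketched roughly as follows. This is a toy Python model, not mongos code: BlockingAbortRouter, RetryableError, pick_shards, execute and send_abort are all hypothetical names introduced for illustration, and send_abort is assumed to block until the shard has responded.

```python
from concurrent.futures import ThreadPoolExecutor


class RetryableError(Exception):
    """Stand-in for a retryable transaction error (e.g. a snapshot error)."""


class BlockingAbortRouter:
    """Toy model of a router that aborts cleared participants before retrying."""

    def __init__(self, send_abort):
        # send_abort(shard) is assumed to return only once the shard has replied.
        self.send_abort = send_abort
        self.participants = set()

    def run_statement(self, pick_shards, execute, max_attempts=3):
        for attempt in range(max_attempts):
            shards = set(pick_shards(attempt))
            newly_added = shards - self.participants
            self.participants |= newly_added
            try:
                return execute(shards)
            except RetryableError:
                if attempt == max_attempts - 1:
                    raise
                # Clear the participants added by the failed attempt, then
                # wait for every abort response so the retry cannot race.
                self.participants -= newly_added
                self._abort_and_wait(newly_added)

    def _abort_and_wait(self, shards):
        if not shards:
            return
        with ThreadPoolExecutor(max_workers=len(shards)) as pool:
            # list() drains the iterator, so this returns only after all
            # aborts have completed (and re-raises any abort failure).
            list(pool.map(self.send_abort, shards))
```

In this model the retry is delayed by the slowest abort response, which is the delay the rest of this ticket proposes to remove.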

To avoid this delay, the router could skip the aborts and send the retry immediately after clearing the new participants from the participant list. This relies on any shard that still has an unaborted transaction from the first attempt implicitly aborting its local transaction before servicing the retry (this behavior would need to be added). To guarantee no transactions are left open, the router should track every participant that was ever targeted and, when the transaction reaches a terminal state (commit, abort, or implicit abort), send abortTransaction to those that were targeted but are not in the final participant list.
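
A minimal sketch of the proposed bookkeeping, again as a toy Python model rather than mongos code. DeferredAbortRouter, ever_targeted and the method names are hypothetical; the shard-side implicit abort that the proposal depends on is assumed to exist and is not modeled here.

```python
class DeferredAbortRouter:
    """Toy model of a router that defers aborts to the end of the transaction."""

    def __init__(self, send_abort):
        self.send_abort = send_abort
        self.participants = set()   # current participant list
        self.ever_targeted = set()  # every shard targeted by any attempt

    def target(self, shards):
        shards = set(shards)
        self.participants |= shards
        self.ever_targeted |= shards

    def on_retryable_error(self, newly_added):
        # Only clear the list; no abort is sent and nothing is awaited,
        # so the retry can go out immediately.
        self.participants -= set(newly_added)

    def on_terminal_state(self):
        # Called on commit, abort, or implicit abort: clean up shards that
        # were targeted at some point but are not final participants.
        orphans = self.ever_targeted - self.participants
        for shard in sorted(orphans):
            self.send_abort(shard)
        return orphans
```

For example, if the first attempt targets shards A and B, fails with a retryable error, and the retry targets only A, then on_terminal_state() sends a single abort to B.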

