[SERVER-35707] Figure out the transaction abort state on re-targeting exceptions
Created: 20/Jun/18  Updated: 29/Oct/23  Resolved: 19/Sep/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.1.1
Fix Version/s: 4.1.4

Type: Task
Priority: Major - P3
Reporter: Randolph Tan
Assignee: Jack Mulrow
Resolution: Fixed
Votes: 0
Labels: ShardedTxn:RouterSupport
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done before SERVER-36312 Re-enable atClusterTime selection alg... Closed
has to be done after SERVER-36590 Allow shards to start new transaction... Closed
Related
related to SERVER-37207 Only retry failed writes in a batch o... Backlog
related to SERVER-37209 Allow mongos to retry on view resolut... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2018-08-13, Sharding 2018-08-27, Sharding 2018-09-10, Sharding 2018-09-24
Participants:

 Description   

Original Description

Single replica set transactions currently abort unconditionally when an exception occurs. The reason is that the transaction could be holding on to a WriteUnitOfWork (WUOW), which needs to be cleaned up. The abort happens here.

This is problematic for some of the sharding machinery that uses exceptions as control flow. Examples include StaleConfigException, CommandOnShardedViewNotSupportedOnMongod, and SnapshotTooOld (the last of which makes readConcern: snapshot effectively unusable).

New Description

SERVER-36591 handles retries on snapshot errors, so this ticket will track retries on all other re-targeting errors, i.e. CommandOnShardedViewNotSupportedOnMongod, StaleConfigException, and CannotImplicitlyCreateCollection.

Mongos should be allowed to retry on each of these errors. It should pick a new atClusterTime only while retrying the first overall statement in the transaction; on later statements it must use the immutable atClusterTime established during the first statement. Requests to any shard newly added by the current statement must include startTransaction=true on every retry, not just on the first request sent to that shard. If mongos exhausts its allowed retry attempts and any of these errors is returned to the client, the response should include the TransientTransactionError label.
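
The retry rules above can be sketched as a small model. This is only an illustration in Python, not the mongos implementation: RetargetingError, RouterTxn, run_statement, and the target/send/pick_cluster_time callbacks are hypothetical names, and the retry limit of 3 is an assumed value. The one design point worth noting is that a shard is recorded as a participant only after the statement succeeds, so a shard first targeted by this statement keeps receiving startTransaction=true on each retry.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


class RetargetingError(Exception):
    """Hypothetical stand-in for StaleConfigException and similar errors."""

    def __init__(self, code_name: str):
        super().__init__(code_name)
        self.code_name = code_name
        self.error_labels: List[str] = []


@dataclass
class RouterTxn:
    """Hypothetical per-transaction router state."""
    at_cluster_time: Optional[int] = None
    # Shards that took part in an earlier, successful statement.
    participants: Set[str] = field(default_factory=set)


def run_statement(txn: RouterTxn,
                  is_first_statement: bool,
                  target: Callable[[], List[str]],
                  send: Callable[[str, bool, Optional[int]], object],
                  pick_cluster_time: Callable[[], int],
                  max_retries: int = 3):  # assumed limit, not from the ticket
    for attempt in range(max_retries + 1):
        # A new atClusterTime may be chosen only for the first overall
        # statement; later statements reuse the established value.
        if is_first_statement:
            txn.at_cluster_time = pick_cluster_time()
        shards = target()  # re-target on every attempt
        new_shards = set(shards) - txn.participants
        try:
            replies = [send(s, s in new_shards, txn.at_cluster_time)
                       for s in shards]
        except RetargetingError as err:
            if attempt == max_retries:
                # Retries exhausted: label the error for the client.
                err.error_labels.append("TransientTransactionError")
                raise
            continue
        # Record new participants only once the statement has succeeded.
        txn.participants |= new_shards
        return replies
```

In this model the second positional argument to send plays the role of startTransaction, and the third carries atClusterTime.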

Shards can also be modified to abort their local transaction on these errors only if the error is encountered on the first statement that the shard has seen.
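
As a rough sketch of that shard-side rule, again in Python with hypothetical names (RETARGETING_ERRORS, should_abort_local_transaction) rather than the server's actual code: any other exception still aborts unconditionally, as the original description says, while a re-targeting error aborts only when it is hit on the first statement the shard has seen.

```python
# Error names taken from the description above; the set itself is illustrative.
RETARGETING_ERRORS = frozenset({
    "StaleConfigException",
    "CommandOnShardedViewNotSupportedOnMongod",
    "CannotImplicitlyCreateCollection",
})


def should_abort_local_transaction(error_code_name: str,
                                   is_first_statement_on_shard: bool) -> bool:
    """Decide whether a shard aborts its local transaction after an error."""
    if error_code_name not in RETARGETING_ERRORS:
        # Existing behaviour: abort unconditionally on any other exception.
        return True
    # Proposed behaviour: abort on a re-targeting error only if this is the
    # first statement this shard has seen in the transaction.
    return is_first_statement_on_shard
```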



 Comments   
Comment by Githook User [ 20/Sep/18 ]

Author:

{'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com', 'username': 'jsmulrow'}

Message: SERVER-35707 Add helper for clearing participants, improve error messages, and update comments
Branch: master
https://github.com/mongodb/mongo/commit/2f58283213f8a80a37f78b6de2c527951306f2b5

Comment by Jack Mulrow [ 19/Sep/18 ]

Closing this ticket as fixed. Retrying on view errors will be tracked by SERVER-37209. No work is necessary for CannotImplicitlyCreateCollection because shards can't return that error in a transaction: they will throw OperationNotSupportedInATransaction first, which is already verified by no_implicit_collection_creation.js in the sharded_core_txns suite.

Comment by Githook User [ 19/Sep/18 ]

Author:

{'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com', 'username': 'jsmulrow'}

Message: SERVER-35707 Allow mongos to retry on re-targeting errors in a transaction
Branch: master
https://github.com/mongodb/mongo/commit/3bab189695c705ff163721652add910b32c2659e
