[SERVER-67820] SEPTransactionClient::runCRUDOp should inspect writeError from response Created: 06/Jul/22  Updated: 29/Oct/23  Resolved: 11/Aug/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Sanika Phanse (Inactive)
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam3, txn-api-improvements
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2022-08-08, Sharding 2022-08-22
Participants:

 Description   

Currently it only checks the top level error fields from the command response here. It should be calling getStatusFromWriteCommandReply instead.



 Comments   
Comment by Jack Mulrow [ 22/Jul/22 ]

That argument sounds reasonable to me. Going ahead with the change described in this ticket SGTM.

Comment by Randolph Tan [ 11/Jul/22 ]

jack.mulrow@mongodb.com, I think this would make the api more cumbersome to use. This is because the caller would be required to check for write errors for every call to runCRUDOp even though the underlying transaction has already been aborted. If the caller didn't check for write errors, this can also end up getting stuck retrying because the api would try to do txn stuff on a txnNumber that was already aborted and get a txnNumber no longer exist error, retry from the beginning and get the same error again and again.

Comment by Jack Mulrow [ 08/Jul/22 ]

This was actually the intended behavior. I wasn't sure if users of the API could hit cases with either expected errors (like WouldChangeOwningShard) or multiple write errors, so I didn't want runCRUDOp to always throw on any write error and deferred that to the caller instead. I figured always checking the top-level error was fine though since that should mean the whole command failed, which I couldn't imagine being an expected error.

That said, this might not have been the right choice and I'm open to changing it (or maybe making this behavior more obvious somehow), I just wanted to give some context.

Generated at Thu Feb 08 06:09:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.