[DRIVERS-2327] Propagate Original Error for Write Errors Labeled NoWritesPerformed
Created: 12/May/22
Updated: 27/Jan/23

Status: Implementing
Project: Drivers
Component/s: Retryability
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Preston Vasquez
Resolution: Unresolved
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-66479 Create an error label indicating if a... Closed
is depended on by SERVER-66116 Aborted Read with MongoNotPrimaryExce... Blocked
Issue split
split to CSHARP-4288 Propagate Original Error for Write Er... Backlog
split to CXX-2563 Propagate Original Error for Write Er... Backlog
split to CDRIVER-4449 Propagate Original Error for Write Er... Closed
split to GODRIVER-2516 Propagate Original Error for Write Er... Closed
split to MOTOR-1013 Propagate Original Error for Write Er... Closed
split to PHPLIB-935 Propagate Original Error for Write Er... Closed
split to PYTHON-3388 Propagate Original Error for Write Er... Closed
split to RUBY-3079 Propagate Original Error for Write Er... Closed
split to RUST-1433 Propagate Original Error for Write Er... Closed
split to JAVA-4701 Propagate Original Error for Write Er... Closed
split to NODE-4503 Propagate Original Error for Write Er... Closed
Related
related to DRIVERS-2501 Break NoWritesPerformed-Only Error Se... Implementing
related to SERVER-69129 Successive Errors in FailCommand Backlog
related to PYTHON-1573 db.collection.bulkWrite() does not r... Backlog
is related to SERVER-69295 Mongos NoWritesPerformed errors MUST ... Backlog
is related to DRIVERS-2468 Add a test that drivers emit a Comman... Implementing
Driver Changes: Needed
Server Compat: 6.1
Quarter: FY23Q3
Upstream Changes Summary:

Commands that fail with the RetryableWriteError label can now also carry a new NoWritesPerformed error label when no writes were performed while the command ran.

Downstream Changes Summary:

Drivers need to implement a new retryable writes prose test (see mongodb/specifications@e4a5564) and update operation error handling so that, if a retry attempt fails with an error labeled NoWritesPerformed, the original error is propagated.
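
The error handling described above can be sketched roughly as follows. This is a minimal illustration in Python, not any driver's actual implementation: `OperationError`, `has_label`, and `run_with_single_retry` are hypothetical names invented for the example, and the sketch assumes a single retry attempt and leaves out server selection, sessions, and transaction handling.

```python
RETRYABLE_WRITE_ERROR = "RetryableWriteError"
NO_WRITES_PERFORMED = "NoWritesPerformed"


class OperationError(Exception):
    """Hypothetical stand-in for a driver's server or network error type."""

    def __init__(self, message, labels=()):
        super().__init__(message)
        self.labels = frozenset(labels)

    def has_label(self, label):
        return label in self.labels


def run_with_single_retry(attempt):
    """Run `attempt` once and retry it once if the first failure is retryable.

    If the retry fails with an error labeled NoWritesPerformed, the error
    from the first attempt is raised instead of the retry's error.
    """
    try:
        return attempt()
    except OperationError as exc:
        if not exc.has_label(RETRYABLE_WRITE_ERROR):
            raise
        # Keep a reference: the name bound by `except ... as` is cleared
        # when the except block ends.
        original_error = exc

    try:
        return attempt()
    except OperationError as retry_error:
        if retry_error.has_label(NO_WRITES_PERFORMED):
            # The retry made no writes, so it says nothing new about the
            # outcome of the first attempt. Surface the original error.
            raise original_error from retry_error
        raise
```

Chaining with `raise ... from` keeps the retry's error reachable through `__cause__` for debugging while the caller sees the original error. That is a design choice of this sketch, not something the ticket requires.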

Driver Compliance:
Key            Status/Resolution   FixVersion
CDRIVER-4449   Fixed               1.24.0
CXX-2563       Backlog
CSHARP-4288    Backlog
GODRIVER-2516  Done                1.11.0
JAVA-4701      Done                4.8.0
NODE-4503      Done                4.11.0
MOTOR-1013     Duplicate
PYTHON-3388    Fixed               4.4
PHPLIB-935     Fixed               1.16.0
RUBY-3079      Fixed               2.19.0
RUST-1433      Fixed               2.7.0
SWIFT-1624     Won't Do

Description

Summary

What is the problem or use case, what are we trying to achieve?

We return both definite errors (the write definitely aborted) and indefinite errors (the write may have committed) to the client, and we do not make it clear which is which.

We want to change the driver spec so that drivers return an indefinite error whenever the outcome of the write could be indefinite. For example, WriteConcernErrors and SocketExceptions are both indefinite. The server is in the best position to say (probably via an error label) which of the errors it returns are definite. On bulk writes, NotWritablePrimary is the only definite retryable error. NoSuchTransaction is also definite and is handled specially by the driver, though it is not labeled “retriable”. Writes with multi:true are not retryable, so there is no question about which error a driver should return.

Error labels were introduced in 4.3.1 by SERVER-43941. For brevity, the prose tests must only run on versions >= 6.0.

Motivation

Who is the affected end user?

Who are the stakeholders?
Anyone using retryable writes or transaction retryability.

How does this affect the end user?

Are they blocked? Are they annoyed? Are they confused?
If the driver stops retrying for them, they could take an incorrect action and accidentally double commit a write.

How likely is it that this problem or use case will occur?

Main path? Edge case?
This is on the main error path, but with CSOT will become less likely.

If the problem does occur, what are the consequences and how severe are they?

Minor annoyance at a log message? Performance concern? Outage/unavailability? Failover can't complete?
Users may take incorrect action based on unclear information we provide.

Is this issue urgent?

Does this ticket have a required timeline? What is it?

Is this ticket required by a downstream team?

Needed by e.g. Atlas, Shell, Compass?

Is this ticket only for tests?

Does this ticket have any functional impact, or is it just test improvements?
This has a functional impact.



Comments
Comment by Valentin Kavalenka [ 12/Oct/22 ]

For future readers: this issue is caused by SERVER-66116.

Comment by Rachelle Palmer [ 28/Sep/22 ]

Nope, that's good enough for me

Comment by Neil Shweky (Inactive) [ 28/Sep/22 ]

rachelle.palmer@mongodb.com I have merged a PR requiring 6.0+, does this need to be changed?

Comment by Githook User [ 13/Sep/22 ]

Author:

{'name': 'Neil Shweky', 'email': 'neilshweky@gmail.com', 'username': 'Neilshweky'}

Message: DRIVERS-2327: limit prose test to server versions 6.0+ (#1304)

Comment by Githook User [ 01/Sep/22 ]

Author:

{'name': 'Preston Vasquez', 'email': '24281431+prestonvasquez@users.noreply.github.com', 'username': 'prestonvasquez'}

Message: DRIVERS-2327 NoWritesPerformed Retryable Writes Error Handling (#1297)
Branch: master
https://github.com/mongodb/specifications/commit/e4a5564a157cd877b09b52cc467988eb44818021

Generated at Thu Feb 08 08:25:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.