Type: Bug
Resolution: Won't Do
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: server-rapid-response-resolved
Assigned Teams: Storage Execution
Operating System: ALL
Backport Requested: v8.0
Sprint: Execution Team 2024-02-05, Execution Team 2024-02-19, Execution Team 2024-03-04, Execution Team 2024-03-18, Execution Team 2024-04-29, Execution Team 2024-06-10, Execution Team 2024-06-24
Linked BF Score: 200
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

When a batched update (with multi: false) that performs non-idempotent writes fails with a TenantMigrationCommitted or TenantMigrationAborted error, the Atlas proxy should auto-retry only the update statements that did not succeed, not the entire update command. Retrying the whole command re-applies statements that already succeeded, which could lead to duplicate commits. CLOUDP-77814 addressed only batched inserts and accidentally missed batched updates and deletes. Note that this is a problem only if the batched updates are run outside of retryable writes. The client drivers run with retryWrites: true by default, so the severity of this case is lower.
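The intended behavior, retrying only the statements that did not succeed, can be sketched as follows. This is a minimal illustration, not the Atlas proxy's code: `statements_to_retry` is a hypothetical helper, and it assumes each write error carries the failing statement's `index` and a `codeName` field. It also assumes that an ordered batch stops at the first error, so the failing statement and everything after it were never applied.

```python
from typing import Any

# Error names come from the ticket. Matching on a "codeName" field in each
# write error is an assumption made for this sketch.
MIGRATION_ERRORS = {"TenantMigrationCommitted", "TenantMigrationAborted"}


def statements_to_retry(
    updates: list[dict[str, Any]],
    write_errors: list[dict[str, Any]],
    ordered: bool = True,
) -> list[dict[str, Any]]:
    """Return only the update statements that still need to be applied."""
    if not write_errors:
        return []  # every statement succeeded, nothing to retry

    if ordered:
        # An ordered batch stops at the first error, so that statement and
        # everything after it were never applied.
        first = min(write_errors, key=lambda e: e["index"])
        if first["codeName"] not in MIGRATION_ERRORS:
            return []  # a genuine failure, not a migration conflict
        return updates[first["index"]:]

    # An unordered batch attempts every statement, so retry only the ones
    # that failed with a migration error.
    failed = sorted(
        e["index"] for e in write_errors if e["codeName"] in MIGRATION_ERRORS
    )
    return [updates[i] for i in failed]


# Four non-idempotent updates; the third one hits a migration error.
batch = [{"q": {"_id": i}, "u": {"$inc": {"n": 1}}} for i in range(4)]
errors = [{"index": 2, "codeName": "TenantMigrationCommitted"}]
retry = statements_to_retry(batch, errors)
```

Here the first two `$inc` statements already succeeded, so resending the whole batch would increment those documents twice. Resending only `retry` avoids that.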

While investigating the BF, I found that the proxy also auto-retries the aggregation $merge stage, causing inconsistent results. $merge can perform non-transactional writes, so it is not safe to retry. For this reason, $merge is not a supported retryable write. (Note: Serverless does not support $out, but it does support $merge.)
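One way a retry layer could recognize this case is to inspect the pipeline before deciding whether to resend an aggregate command. The sketch below is illustrative only: `aggregate_writes_with_merge` is a hypothetical helper, and it assumes the command is available as a plain dictionary with an `aggregate` key and a `pipeline` list of single-key stage documents.

```python
from typing import Any


def aggregate_writes_with_merge(command: dict[str, Any]) -> bool:
    """Return True if this is an aggregate command containing a $merge stage."""
    if "aggregate" not in command:
        return False  # not an aggregation, so $merge does not apply
    return any("$merge" in stage for stage in command.get("pipeline", []))


# An aggregation that writes its results into another collection.
cmd = {
    "aggregate": "orders",
    "pipeline": [
        {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
        {"$merge": {"into": "totals"}},
    ],
    "cursor": {},
}
should_skip_auto_retry = aggregate_writes_with_merge(cmd)
```

A caller would skip the automatic retry when this returns True and surface the error instead.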

This problem has existed since the introduction of tenant migration in 5.0.

Assignee: Suganthi Mani
Reporter: Suganthi Mani
Participants: Suganthi Mani, TPM Jira Automations Bot
Votes: 0
Watchers: 9

Created: Nov 28 2023 08:00:07 PM UTC
Updated: Aug 16 2024 08:36:06 AM UTC
Resolved: Jun 17 2024 05:20:41 PM UTC
