-
Type: Bug
-
Resolution: Won't Do
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Storage Execution
-
ALL
-
v8.0
-
Execution Team 2024-02-05, Execution Team 2024-02-19, Execution Team 2024-03-04, Execution Team 2024-03-18, Execution Team 2024-04-29, Execution Team 2024-06-10, Execution Team 2024-06-24
-
200
-
None
-
3
When a batched update (with multi: false) that performs non-idempotent writes fails with a TenantMigrationCommitted or TenantMigrationAborted error, the Atlas proxy should auto-retry only the unsuccessful update statements, not the entire update command. Otherwise, statements that already succeeded are applied a second time, which could lead to duplicate commits. CLOUDP-77814 addressed only batched inserts and accidentally missed batched updates and deletes. Note that this is a problem only if the batched updates are run outside of retryable writes. By default, the client drivers run with retryWrites:true, so the severity of this case is lower.
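To make the difference concrete, here is a minimal, self-contained sketch that contrasts retrying the whole command with retrying only the statements that were not applied. It is a simulation, not the Atlas proxy's code: `BatchResult`, `FakeServer`, and both retry helpers are hypothetical names, and each statement is modeled as a simple increment of an in-memory counter standing in for a non-idempotent update such as `$inc`. It also assumes an ordered batch that stops at the first failing statement.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Error names taken from the ticket; everything else here is hypothetical.
MIGRATION_ERRORS = {"TenantMigrationCommitted", "TenantMigrationAborted"}


@dataclass
class BatchResult:
    n_applied: int            # statements applied before the batch stopped
    error: Optional[str]      # error name that stopped the batch, if any


class FakeServer:
    """In-memory stand-in: each statement increments a counter (non-idempotent)."""

    def __init__(self) -> None:
        self.counter = 0
        self.calls = 0

    def send_batch(self, statements: List[int]) -> BatchResult:
        self.calls += 1
        for i, inc in enumerate(statements):
            # Simulate a migration error on the second statement of the first call.
            if self.calls == 1 and i == 1:
                return BatchResult(n_applied=i, error="TenantMigrationCommitted")
            self.counter += inc
        return BatchResult(n_applied=len(statements), error=None)


SendBatch = Callable[[List[int]], BatchResult]


def retry_whole_command(send: SendBatch, statements: List[int], attempts: int = 3) -> None:
    """Unsafe variant: resends every statement, including ones already applied."""
    for _ in range(attempts):
        result = send(list(statements))
        if result.error is None:
            return
        if result.error not in MIGRATION_ERRORS:
            raise RuntimeError(result.error)
    raise RuntimeError("retries exhausted")


def retry_failed_statements_only(send: SendBatch, statements: List[int], attempts: int = 3) -> None:
    """Safer variant: drops the statements that were applied before retrying."""
    remaining = list(statements)
    for _ in range(attempts):
        result = send(remaining)
        remaining = remaining[result.n_applied:]
        if result.error is None:
            return
        if result.error not in MIGRATION_ERRORS:
            raise RuntimeError(result.error)
    raise RuntimeError("retries exhausted")


if __name__ == "__main__":
    unsafe, safe = FakeServer(), FakeServer()
    retry_whole_command(unsafe.send_batch, [1, 1, 1])
    retry_failed_statements_only(safe.send_batch, [1, 1, 1])
    print(unsafe.counter, safe.counter)  # prints "4 3"
```

In this simulation the whole-command retry applies the first increment twice, so the counter ends at 4 instead of 3, while the statement-level retry resends only the two statements that had not been applied.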
While investigating the BF, I found that the proxy also auto-retries the aggregation $merge stage, causing inconsistent results. $merge can perform non-transactional writes, so it is not safe to retry. For this reason, $merge is not a supported retryable write. (Note: Serverless does not support $out, but it does support $merge.)
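One way to picture the check this implies is a guard that refuses to auto-retry an aggregate command whose pipeline contains a writing stage. The sketch below is only an illustration under that assumption: `aggregate_is_safe_to_auto_retry` is a hypothetical helper, not part of the proxy or any driver, and it inspects the command as a plain dictionary.

```python
from typing import Any, Dict, List

# Aggregation stages that write to a collection.
WRITING_STAGES = ("$merge", "$out")


def aggregate_is_safe_to_auto_retry(command: Dict[str, Any]) -> bool:
    """Return False when an aggregate command's pipeline contains a writing stage.

    Hypothetical helper: blindly re-running such a pipeline could apply
    the same writes more than once.
    """
    pipeline: List[Dict[str, Any]] = command.get("pipeline", [])
    for stage in pipeline:
        if any(name in stage for name in WRITING_STAGES):
            return False
    return True


if __name__ == "__main__":
    read_only = {"aggregate": "orders", "pipeline": [{"$match": {"status": "A"}}]}
    with_merge = {
        "aggregate": "orders",
        "pipeline": [{"$match": {"status": "A"}}, {"$merge": {"into": "summary"}}],
    }
    print(aggregate_is_safe_to_auto_retry(read_only))   # prints "True"
    print(aggregate_is_safe_to_auto_retry(with_merge))  # prints "False"
```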
This problem has existed since the introduction of tenant migration in 5.0.