- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Catalog and Routing
- 🟩 Routing and Topology
The RetryPolicy for non-idempotent operations retries on any NotPrimary error, on the assumption that if the remote was not a primary, no write could have occurred. This is a bit concerning to me for a few reasons:
- The NotPrimaryError category is documented as not being sufficient to determine whether a write has actually occurred; the individual error codes must be inspected to determine that.
- Not all non-idempotent operations are writes. For example, if a getMore receives an error like "InterruptedDueToReplStateChange" (assuming it can), are we sure the server-side cursor metadata wasn't modified? This case seems unlikely to cause an issue in practice, but it is worth considering.
In light of this, we should consider making the kNotIdempotent retry policy less permissive by default, so that it retries only in cases that are truly safe.
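To make the proposal concrete, here is a rough sketch of what a stricter policy might look like. It is not the actual RetryPolicy code: the `ErrorCode` enum, `RetryMode`, `isNotPrimaryError`, `isDefinitelyNotExecuted`, and `shouldRetry` are hypothetical names made up for illustration, and the single entry in the per-code allowlist is a placeholder assumption. Deciding which codes really guarantee that nothing executed is exactly the open question here.

```cpp
// Hypothetical stand-ins for a few server error codes; not the real ErrorCodes type.
enum class ErrorCode {
    NotWritablePrimary,
    InterruptedDueToReplStateChange,
    PrimarySteppedDown,
    HostUnreachable,
};

// Hypothetical retry modes; kNotIdempotent mirrors the name used in this ticket.
enum class RetryMode { kIdempotent, kNotIdempotent };

// Category-level check, standing in for the permissive behavior described above.
// ASSUMPTION: the membership of this category is illustrative only.
bool isNotPrimaryError(ErrorCode code) {
    switch (code) {
        case ErrorCode::NotWritablePrimary:
        case ErrorCode::InterruptedDueToReplStateChange:
        case ErrorCode::PrimarySteppedDown:
            return true;
        default:
            return false;
    }
}

// Per-code allowlist of errors after which the operation is known not to have run.
// ASSUMPTION: NotWritablePrimary is a placeholder entry, not a verified claim.
bool isDefinitelyNotExecuted(ErrorCode code) {
    switch (code) {
        case ErrorCode::NotWritablePrimary:
            return true;
        default:
            return false;
    }
}

// Idempotent operations may retry on the whole category; non-idempotent
// operations retry only when the specific code is on the allowlist.
bool shouldRetry(RetryMode mode, ErrorCode code) {
    if (mode == RetryMode::kIdempotent) {
        return isNotPrimaryError(code);
    }
    return isDefinitelyNotExecuted(code);
}
```

With this shape, a getMore that fails with InterruptedDueToReplStateChange under kNotIdempotent would not be retried, whereas an idempotent operation hitting the same error still would be. The allowlist defaults to "do not retry", so any newly added code stays on the safe side until someone reviews it.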
- is related to
  - SERVER-108318 Introduce new error label indicating a failure is unconditionally retryable (In Code Review)
  - SERVER-42908 Add ErrorCodes to retryable errors to match drivers (Closed)
  - SERVER-66479 Create an error label indicating if a retryable error is "definite". (Closed)