Type: Bug
Resolution: Done
Priority: Major - P3
Component/s: Causal Consistency, Retryability, Sessions
Drivers Changes
Drivers should sync the new unified sessions tests in mongodb/specifications@5a15b65. Drivers that enable `causalConsistency` in implicit sessions by default will require code changes. Note that `causalConsistency` must continue to be enabled by default in explicit sessions.
Summary
Read concerns "linearizable" and "available" cannot be used in causally consistent sessions (see documentation here; see server code here). However, the Causal Consistency spec says that causal consistency should be enabled in sessions by default unless snapshot=true. Enabling causal consistency in implicit sessions can cause server errors when a user sets read concern "linearizable" or "available".
A specific case where this causes a problem is with retryable reads because they reuse the same session and set an "operation time". As a result, a retried read will send a read concern document that includes afterClusterTime. If a user has also set the read concern to "linearizable" or "available", that retried read will fail with error InvalidOptions(72) with message:
afterClusterTime field can be set only if level is equal to majority, local, or snapshot
Update the Causal Consistency spec to prevent read operations that use implicit sessions from sending a read concern document with field afterClusterTime when field level is not "majority", "local", or "snapshot". We can accomplish that by requiring drivers to set causalConsistency=false either for all implicit sessions or only for implicit sessions where the read concern is not "majority", "local", or "snapshot".
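The rule proposed above can be sketched as a small helper that builds the read concern document. This is only an illustration, not code from any driver: the name `build_read_concern` and its parameters are hypothetical, the operation time is modeled as a plain integer rather than a BSON timestamp, and treating an unset level as compatible with afterClusterTime is an assumption.

```python
from typing import Any, Dict, Optional

# Levels for which the server accepts afterClusterTime, taken from the
# error message quoted in this ticket.
CAUSALLY_CONSISTENT_LEVELS = {"majority", "local", "snapshot"}


def build_read_concern(
    level: Optional[str],
    operation_time: Optional[int],
    causal_consistency: bool,
) -> Dict[str, Any]:
    """Hypothetical helper: build the readConcern document for one read."""
    doc: Dict[str, Any] = {}
    if level is not None:
        doc["level"] = level

    # Assumption: an unset level falls back to the server default and is
    # therefore treated as compatible with afterClusterTime.
    level_ok = level is None or level in CAUSALLY_CONSISTENT_LEVELS

    # Only attach afterClusterTime when the session is causally consistent,
    # an operation time is known, and the level permits it.
    if causal_consistency and operation_time is not None and level_ok:
        doc["afterClusterTime"] = operation_time
    return doc
```

With this helper, a read using level "linearizable" never carries afterClusterTime, regardless of the session's causal consistency setting. Setting causalConsistency=false for all implicit sessions corresponds to always passing `causal_consistency=False` for them.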
Open questions:
- Does sending readConcern.afterClusterTime with a retried read make the read result more correct? If so, should we still require that drivers create implicit sessions with causalConsistency=true only if the read concern is "majority", "local", or "snapshot"?
- Does using implicit sessions with causalConsistency=true for writes have any effect (positive or negative) on retryable writes?
Example repro steps
- Create a Client that uses read concern "linearizable".
- Set a failpoint that returns error code ShutdownInProgress(91) for a "find" operation one time.
- Run a "find" operation on a collection.
We expect the "find" operation to be retried once and then successfully return results. What actually happens is that the server responds the first time with the failpoint error ShutdownInProgress(91), and then, when the "find" is retried, the server responds with error InvalidOptions(72) with message:
afterClusterTime field can be set only if level is equal to majority, local, or snapshot
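The repro steps can be modeled without a real deployment. The sketch below is a toy simulation, not driver or server code: `FakeServer`, `ServerError`, and `find_with_retry` are hypothetical names, cluster times are plain integers, and the fake server only imitates the one-shot failpoint and the afterClusterTime validation described in this ticket.

```python
from typing import Any, Dict, Optional

ALLOWED_LEVELS = {"majority", "local", "snapshot"}


class ServerError(Exception):
    """Hypothetical error carrying a server code and operationTime."""

    def __init__(self, code: int, message: str, operation_time: int) -> None:
        super().__init__(message)
        self.code = code
        self.operation_time = operation_time


class FakeServer:
    """Toy stand-in for a server with a one-shot failpoint on "find"."""

    def __init__(self) -> None:
        self.fail_next_find = True  # failpoint: fail "find" one time
        self.cluster_time = 100

    def find(self, read_concern: Dict[str, Any]) -> Dict[str, Any]:
        self.cluster_time += 1
        level = read_concern.get("level", "local")
        # Imitate the validation that produces InvalidOptions(72).
        if "afterClusterTime" in read_concern and level not in ALLOWED_LEVELS:
            raise ServerError(
                72,
                "afterClusterTime field can be set only if level is "
                "equal to majority, local, or snapshot",
                self.cluster_time,
            )
        if self.fail_next_find:
            self.fail_next_find = False
            raise ServerError(91, "ShutdownInProgress", self.cluster_time)
        return {"ok": 1, "operationTime": self.cluster_time}


def find_with_retry(
    server: FakeServer, level: str, causal_consistency: bool
) -> Dict[str, Any]:
    """Run "find" with at most one retry, reusing one implicit session."""
    operation_time: Optional[int] = None  # state kept by the session
    for attempt in range(2):  # initial attempt plus one retry
        read_concern: Dict[str, Any] = {"level": level}
        if causal_consistency and operation_time is not None:
            read_concern["afterClusterTime"] = operation_time
        try:
            return server.find(read_concern)
        except ServerError as err:
            # The error reply advances the session's operationTime.
            operation_time = err.operation_time
            if err.code != 91 or attempt == 1:
                raise
    raise AssertionError("unreachable: the loop returns or raises")
```

In this model, `find_with_retry(FakeServer(), "linearizable", causal_consistency=True)` raises a `ServerError` with code 72 on the retry, while the same call with `causal_consistency=False` returns a result after one retry.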
Additional references to causal consistency that may require updating
The Retryable Reads specification states, in its answer to the question "Can drivers resend the same wire protocol message on retry attempts?":
2. If the initial attempt failed with a server error, then the session's operationTime would be advanced and the next read would include a larger readConcern.afterClusterTime.
If we choose to disable causal consistency in implicit sessions, retried reads will not send readConcern.afterClusterTime. We should update that answer to describe the new behavior.
Motivation
Who is the affected end user?
Users who set read concern "linearizable" or "available" and leave retryable reads enabled (enabled by default).
How does this affect the end user?
Retried read operations will fail with server error InvalidOptions(72) with message:
afterClusterTime field can be set only if level is equal to majority, local, or snapshot
How likely is it that this problem or use case will occur?
If a user sets read concern "linearizable" or "available" and doesn't explicitly disable retryable reads, any retried read will return an error.
If the problem does occur, what are the consequences and how severe are they?
Retried read operations that might otherwise have succeeded will always return an error instead of a result. In affected drivers, retryable reads do not work with read concern "linearizable" or "available".
Is this issue urgent?
This bug affects at least the Go driver and could affect other drivers as well. The Java driver sets causalConsistency=false for all implicit sessions and is not affected.
The bug in the Go driver affects the mongosync project and is moderately urgent. The team maintaining mongosync does have a workaround, which is to use explicit sessions for all operations that need to use read concern "linearizable" or "available" and set causalConsistency=false on the session.
Is this ticket required by a downstream team?
No.
Is this ticket only for tests?
No.
- related to
  - GODRIVER-2478 Create implicit sessions with "causalConsistency=false" (Closed)
- split to
  - CDRIVER-4431 Disable causal consistency in implicit sessions (Closed)
  - CSHARP-4262 Disable causal consistency in implicit sessions (Closed)
  - CXX-2548 Disable causal consistency in implicit sessions (Closed)
  - GODRIVER-2497 Disable causal consistency in implicit sessions (Closed)
  - MOTOR-997 Disable causal consistency in implicit sessions (Closed)
  - NODE-4447 Disable causal consistency in implicit sessions (Closed)
  - PHPLIB-915 Disable causal consistency in implicit sessions (Closed)
  - PYTHON-3360 Disable causal consistency in implicit sessions (Closed)
  - RUBY-3058 Disable causal consistency in implicit sessions (Closed)
  - RUST-1414 Disable causal consistency in implicit sessions (Closed)
  - JAVA-4681 Disable causal consistency in implicit sessions (Closed)