- Type: Spec Change
- Resolution: Unresolved
- Priority: Unknown
- None
- Component/s: CSOT
- Needed
Summary
There was a discussion in the CSOT channel about whether maxAwaitTimeMS should be adjusted based on the remaining timeout and the server RTT.
The current spec describes incorporating the server's minimum RTT into the calculation of maxTimeMS for commands, so that operations likely to fail due to the timeout are not sent to the server, which reduces connection churn. However, that part of the spec covers the calculation of maxTimeMS for general command execution; it does not address adjusting maxAwaitTimeMS for cursor operations.
Specifically, the spec does not say whether maxAwaitTimeMS should be capped at the remaining timeoutMS - RTT during command execution, which particularly affects tailable awaitData cursors. This omission could lead to connection churn in drivers.
For example, if maxAwaitTimeMS is 9ms, timeoutMS is 10ms, and RTT is 2ms, the server may wait up to 9ms for data and the round trip adds about 2ms, which exceeds the 10ms timeout. A getMore command can therefore time out when no data is available, causing the driver to close the connection. Moreover, the next() method on a tailable cursor executes getMore in a loop until data becomes available or the timeout expires. Once the remaining timeoutMS - RTT drops below maxAwaitTimeMS, sending the original maxAwaitTimeMS is no longer useful. Instead, the driver should consider sending the remaining timeoutMS - RTT as maxAwaitTimeMS, giving the server a chance to respond with an empty batch. On the next iteration, the timeout is likely to expire before another request is sent, which avoids closing the connection.
This ticket seeks to clarify the specification to ensure consistent behaviour across drivers.
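The adjustment proposed in the example above can be sketched as follows. This is a minimal illustration of the idea under discussion, not behaviour defined by the spec: the function name `adjusted_max_await_time_ms` and its parameters are hypothetical, and returning `None` to mean "do not send another getMore" is an assumption made for this sketch.

```python
from typing import Optional


def adjusted_max_await_time_ms(
    max_await_time_ms: int,
    remaining_timeout_ms: int,
    min_rtt_ms: int,
) -> Optional[int]:
    """Hypothetical helper: cap maxAwaitTimeMS at the remaining timeoutMS - RTT.

    Returns None when the remaining budget is already used up by the round
    trip, meaning the driver would not send another getMore (an assumption
    of this sketch, not spec text).
    """
    budget_ms = remaining_timeout_ms - min_rtt_ms
    if budget_ms <= 0:
        # The round trip alone would exceed the remaining timeout.
        return None
    # Never wait longer on the server than the budget allows.
    return min(max_await_time_ms, budget_ms)


# Values from the ticket: maxAwaitTimeMS=9, timeoutMS=10, RTT=2.
first = adjusted_max_await_time_ms(9, 10, 2)   # 8: capped by 10 - 2
# After the server waits 8ms and the round trip takes ~2ms, nothing remains.
second = adjusted_max_await_time_ms(9, 0, 2)   # None: no getMore is sent
```

With the ticket's values, the first getMore would carry a wait of 8ms instead of 9ms, so the server can return an empty batch before the 10ms timeout, and the following iteration would stop without sending a request.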
Motivation
Who is the affected end user?
Drivers engineers implementing CSOT, and users who may encounter connection closures when no data is available for a tailable awaitData cursor.
How does this affect the end user?
Lack of clarity may cause discrepancies across driver implementations.
How likely is it that this problem or use case will occur?
Likely in scenarios involving tailable awaitData cursors.
If the problem does occur, what are the consequences and how severe are they?
Potential for increased connection churn, impacting users' application performance and reliability.
Is this issue urgent?
Moderate urgency.
Is this ticket required by a downstream team?
No
Is this ticket only for tests?
No
Acceptance Criteria
- The specification must explicitly state whether and how maxAwaitTimeMS should be adjusted in relation to timeoutMS - RTT for tailable awaitData cursors.
- depends on
  - DRIVERS-2884 CSOT avoid connection churn when operations timeout (Investigating)
- related to
  - DRIVERS-2884 CSOT avoid connection churn when operations timeout (Investigating)
- split to
  - CDRIVER-5837 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - CSHARP-5441 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - CXX-3200 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - GODRIVER-3444 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - JAVA-5720 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - MOTOR-1418 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - NODE-6621 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - PHPLIB-1601 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - PYTHON-5012 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - RUBY-3603 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)
  - RUST-2123 Clarify maxAwaitTimeMS adjustment by timeoutMS and RTT (Blocked)