Server Compat:
- 4.4
- 5.0
- 5.3
Quarter:
- FY24Q4
Story Points:
3
Upstream Changes Summary:
~~DRIVERS-1571~~:

Drivers should implement the changes to server selection and to the read/write retry mechanisms, as well as the new prose tests: specifications@86d961f

2024-02-21: Drivers that have not yet completed this ticket should reference f5bb605 (DRIVERS-2828) for updated prose test specification.
Compass/DevTools Changes:
Not Needed
Confidence Status:
None

Documentation Changes:
Not Needed
Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
2. Would you like the user to see examples of the syntax and/or executable code and its output?
3. Which versions of the driver/connector does this apply to?


There are several scenarios in which it would be useful to redirect reads or writes to a different mongos.

A MongoDB sharded cluster deployment may find itself in a situation in which a mongos reports itself as healthy but is unable to execute any queries. The driver has attempted to retry the failing queries, but in a number of cases it selected the same mongos that failed in the first place, so the retry also failed (for the same reason as the original attempt) and the error was propagated to the application.
Currently, when the topology is sharded, the server selection spec requires a random server to be selected for each operation. This permits the same failed mongos to be selected for both an operation and its retry, so the query fails even when there are healthy mongoses in the deployment that could have executed it successfully.

The suggested improvement is for the driver, when in sharded cluster topology, to:

- Track whether a server selection request is for the first attempt or for a retry.
- Track the server used for the first attempt.
- When selecting the server for the retry, if there are multiple eligible mongoses, select randomly from mongoses other than the one used for the first attempt.
- Nice to have: determine whether a mongos is healthy before making the attempt and, if it is unhealthy, exclude it from selection.
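
The selection rule suggested above could be sketched roughly as follows. This is only an illustration, not the driver's implementation: `MongosDescription`, `selectForAttempt`, and the injectable `random` parameter are hypothetical names introduced here, and the health check is left out.

```typescript
// Hypothetical, simplified stand-in for a server description.
interface MongosDescription {
  address: string;
}

// Picks a mongos for an attempt. `firstAttemptServer` is undefined on the
// first attempt and set to the previously used server on a retry.
function selectForAttempt(
  eligible: MongosDescription[],
  firstAttemptServer?: MongosDescription,
  random: () => number = Math.random
): MongosDescription {
  if (eligible.length === 0) {
    throw new Error('no eligible mongos');
  }
  // On a retry, prefer every mongos except the one used for the first attempt.
  const others = firstAttemptServer
    ? eligible.filter(s => s.address !== firstAttemptServer.address)
    : eligible;
  // Fall back to the full list when the first-attempt server is the only one.
  const pool = others.length > 0 ? others : eligible;
  return pool[Math.floor(random() * pool.length)];
}

const mongoses = [{ address: 'mongos-a:27017' }, { address: 'mongos-b:27017' }];
const first = selectForAttempt(mongoses);
const retry = selectForAttempt(mongoses, first);
// With two eligible mongoses, the retry always lands on the other one.
console.log(first.address, retry.address);
```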

Acceptance Criteria:

Implementation:

Update server selection to accept a list of deprioritised servers and, when the topology is sharded, to not select from them if other servers are present.
When no other servers are present, a deprioritised server must be selected.
When retrying a read or write, record the previously selected server and pass it to server selection in the array of deprioritised servers.
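
A rough sketch of how the three points above might fit together. All names here (`TopologyType`, `filterDeprioritized`, `pickRandom`, `runWithOneRetry`) are hypothetical and do not reflect the driver's actual API. For brevity the operation is synchronous and every error is treated as retryable, which a real driver would not do.

```typescript
// Hypothetical, reduced set of topology types for illustration.
type TopologyType = 'Sharded' | 'ReplicaSetWithPrimary' | 'Single';

// Drops deprioritised servers only when the topology is sharded and at
// least one other server remains; otherwise returns the list unchanged.
function filterDeprioritized(
  topologyType: TopologyType,
  servers: string[],
  deprioritized: string[]
): string[] {
  if (topologyType !== 'Sharded') return servers;
  const preferred = servers.filter(s => deprioritized.indexOf(s) === -1);
  return preferred.length > 0 ? preferred : servers;
}

function pickRandom(servers: string[]): string {
  if (servers.length === 0) throw new Error('no servers available');
  return servers[Math.floor(Math.random() * servers.length)];
}

// Runs the operation once and retries it once on failure, passing the
// previously selected server as deprioritised for the retry selection.
function runWithOneRetry<T>(
  topologyType: TopologyType,
  servers: string[],
  operation: (server: string) => T,
  pick: (servers: string[]) => string = pickRandom
): T {
  const deprioritized: string[] = [];
  const firstServer = pick(filterDeprioritized(topologyType, servers, deprioritized));
  try {
    return operation(firstServer);
  } catch {
    deprioritized.push(firstServer);
    const retryServer = pick(filterDeprioritized(topologyType, servers, deprioritized));
    return operation(retryServer);
  }
}
```

Keeping the filter separate from the random pick means the fallback to a deprioritised server (when it is the only one) is handled in one place.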

Testing:

Unit Tests

Prose tests

Two new tests for retryable writes
- Test that in a sharded cluster writes are retried on a different mongos if one is available
- Test that in a sharded cluster writes are retried on the same mongos if no other is available
Two new tests for retryable reads
- Retryable Reads Are Retried on a Different mongos if One is Available
- Retryable Reads Are Retried on the Same mongos if No Others are Available

Won't Do:

Determining whether a mongos is healthy, as the spec does not define what that means

is related to

NODE-5905 Update prose tests for mongos deprioritization during retryable ops

split from

DRIVERS-1571 Direct read/write retries to another mongos if possible