Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- sharding-nyc-subteam3

Assigned Teams:

Cluster Scalability
Sprint:
Sharding NYC 2023-09-18, Sharding NYC 2023-10-02, Sharding NYC 2023-10-16, Sharding NYC 2023-10-30
Linked BF Score:
113
Story Points:
3
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The current code has retry loops with no backoff, and failure notification to the replica set monitor (RSM) is asynchronous. As a result, when a request fails, the calling thread calls failedHost on the RSM and the retry loop immediately issues another request. This happens within microseconds, so the next attempt may hit the same failure because not enough time has passed for the RSM to process the first one.

This ticket is to improve this behavior by blocking the targeter when an error such as NotPrimary or InterruptedDueToReplStateChange occurs (the list is not exhaustive). The method that reports the failure to the RSM should return only once the RSM's getHostsOrRefresh request would return a different result, or a timeout occurs.
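The blocking behavior described above could look roughly like the following sketch. It is not the server's RSM code: ToyMonitor, onPrimaryChanged, getPrimary, and failedHostAndWait are hypothetical names invented for illustration, and the monitor's view is reduced to a single primary host string. The sketch assumes the asynchronous monitoring side signals a condition variable whenever its view changes.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a replica set monitor. This is NOT the real RSM
// interface; it only models "which host would targeting pick right now".
class ToyMonitor {
public:
    explicit ToyMonitor(std::string primary) : _primary(std::move(primary)) {}

    // Called by the asynchronous monitoring side when it observes a new primary.
    void onPrimaryChanged(const std::string& newPrimary) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _primary = newPrimary;
        }
        _cv.notify_all();  // wake any caller blocked in failedHostAndWait
    }

    std::string getPrimary() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _primary;
    }

    // Reports that `failedHost` returned an error and blocks the caller until
    // targeting would yield a different host, or until `timeout` elapses.
    // Returns true if the view changed, false if the wait timed out.
    bool failedHostAndWait(const std::string& failedHost,
                           std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(_mutex);
        return _cv.wait_for(lk, timeout,
                            [&] { return _primary != failedHost; });
    }

private:
    mutable std::mutex _mutex;
    std::condition_variable _cv;
    std::string _primary;
};
```

With something along these lines, a retry loop that calls failedHostAndWait before its next attempt would no longer spin against the same host. If the call returns false, the loop can still retry, but only after the timeout has given the monitor time to refresh.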

related to

SERVER-50342 Make version of Shard::runCommand that returns a future

Open

Assignee: Unassigned
Reporter: Lamont Nelson
Participants: Lamont Nelson
Votes: 0
Watchers: 2

Created: Jun 29 2023 10:57:29 PM UTC
Updated: Apr 22 2024 04:33:02 PM UTC
