Type: Bug
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Sharding
Labels: None
Assigned Teams: Cluster Scalability
Sprint: Sharding 2018-10-22
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

The server selection algorithm directs reads to randomly chosen servers, which causes blocking when the reads are causally dependent and the selected server has not yet applied the operations they depend on.

Example: a web application can tolerate some staleness but must have predictably fast response times. It reads some state (RS), and the next read (R) depends on that state. The server selected for RS (S1) could also answer R without delay, but instead a different server (S2), whose replication lags behind S1, is selected at random, and the response is blocked until S2 catches up.

"Pinning" for clients has been deprecated, but client applications that need to distribute reads to secondaries and still have predictable latencies under causal consistency could benefit from shorter-lived pinning scoped to a session.

"Session pinning" can be achieved through the future work identified in the Max Staleness specification (quoted below).

"If a future spec allows applications to use readConcern "afterOptime" [also "afterClusterTime"], clients should prefer secondaries that have already replicated to that opTime, so reads do not block. This is an extension of the mongos logic for CSRS to applications."

Rather than pinning a client to a particular server, a session becomes pinned to a set of eligible servers that can respond equivalently without blocking.
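The proposed behaviour can be sketched in a few lines. This is only an illustration under stated assumptions, not driver code: `Server`, `last_applied`, `eligible_servers`, and `select_server` are hypothetical names, cluster times are modelled as plain integers, and the fallback to all servers when none has caught up is an assumption of this sketch rather than something the issue specifies.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Server:
    """Hypothetical view of a server as the selector might see it."""
    name: str
    last_applied: int  # assumed: latest cluster time this server has applied


def eligible_servers(servers: List[Server],
                     after_cluster_time: Optional[int]) -> List[Server]:
    """Return the servers that can answer a causally consistent read without blocking."""
    if after_cluster_time is None:
        # A new session has no causal dependency, so every server qualifies.
        return list(servers)
    return [s for s in servers if s.last_applied >= after_cluster_time]


def select_server(servers: List[Server],
                  after_cluster_time: Optional[int],
                  rng=random) -> Server:
    """Pick at random among eligible servers.

    Assumption of this sketch: if no server has caught up, fall back to
    all servers, in which case the read would block as it does today.
    """
    candidates = eligible_servers(servers, after_cluster_time)
    return rng.choice(candidates or list(servers))
```

With S1 at cluster time 100 and S2 at 90, a session whose last read observed time 95 would only ever be routed to S1, so the dependent read R from the example would not wait for S2 to catch up.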

Applications may need to consider a side effect. Starting a new session with no initial last opTime (read concern afterClusterTime) allows selection from all servers regardless of staleness or lag, but servers with the least replication lag may be selected disproportionately often because they meet the after-operation-time criterion of more sessions.
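The skew toward the least-lagged servers can be seen by counting, for a handful of sessions, how many sessions each server is eligible for. The sketch below is a toy model with made-up numbers: `eligibility_counts` is a hypothetical helper, cluster times are plain integers, and `None` stands for a new session with no causal dependency.

```python
from collections import Counter
from typing import Dict, List, Optional


def eligibility_counts(last_applied: Dict[str, int],
                       session_times: List[Optional[int]]) -> Counter:
    """Count, per server, the sessions whose afterClusterTime it already satisfies."""
    counts = Counter({name: 0 for name in last_applied})
    for after_cluster_time in session_times:
        for name, applied in last_applied.items():
            # A session with no causal dependency (None) accepts every server.
            if after_cluster_time is None or applied >= after_cluster_time:
                counts[name] += 1
    return counts


# Illustrative values only: S1 is the most up to date, S3 lags the most.
last_applied = {"S1": 100, "S2": 90, "S3": 80}
session_times = [None, 85, 95, 100]

counts = eligibility_counts(last_applied, session_times)
print(dict(counts))
```

In this toy setup S1 is eligible for all four sessions, S2 for two, and S3 only for the new session, so S1 would receive a larger share of reads than uniform random selection would give it.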

is duplicated by

SERVER-36042 Server Selection Algorithm Causes Blocking with Causal Consistency

Closed

Assignee: [DO NOT USE] Backlog - Cluster Scalability
Reporter: Simon Yarde
Participants: [DO NOT USE] Backlog - Cluster Scalability, Misha Tyulenev, Simon Yarde
Votes: 0
Watchers: 15

Created: Jul 03 2018 01:18:02 PM UTC
Updated: Nov 17 2023 10:23:18 PM UTC
