Session Pinning / Server Selection Algorithm Causes Blocking with Causal Consistency

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Sharding
    • Cluster Scalability
    • Sharding 2018-10-22
    • 3

      The server selection algorithm directs reads to randomly chosen servers, which causes blocking when the reads are causally dependent and the selected server has not yet applied the operations they depend on.

      Example: a web application can tolerate some staleness but must have predictably fast response times. It reads some state (RS) on which the next read (R) depends. The server selected for RS (S1) could also respond to R without delay, but instead a different server (S2), whose replication lags behind S1, is selected at random, and the response is blocked until S2 catches up.

      "Pinning" for clients has been deprecated, but client applications that need to distribute reads to secondaries and have predictable latencies under causal consistency could benefit from shorter-lived pinning within sessions.

      "Session pinning" can be achieved by the future work identified in the Max Staleness specification (below).

      "If a future spec allows applications to use readConcern "afterOptime" [also "afterClusterTime"], clients should prefer secondaries that have already replicated to that opTime, so reads do not block. This is an extension of the mongos logic for CSRS to applications."

      Rather than pinning a client to a particular server, a session becomes pinned to a set of eligible servers that can respond equivalently without blocking.
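The idea of pinning a session to a set of eligible servers can be sketched roughly as follows. This is only an illustration of the proposal, not driver or server code: `ServerDescription`, `eligible_servers`, and `select_server` are hypothetical names, and optimes are simplified to plain integers. It assumes the driver knows each member's last applied optime and the session's last observed operation time.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ServerDescription:
    """Hypothetical view of a replica-set member as seen by the client."""
    address: str
    last_applied: int  # simplified integer stand-in for the last applied optime


def eligible_servers(
    servers: Sequence[ServerDescription],
    after_cluster_time: Optional[int],
) -> List[ServerDescription]:
    """Return the servers that could answer a causally consistent read without blocking."""
    if after_cluster_time is None:
        # A new session has no causal dependency yet, so every server qualifies.
        return list(servers)
    return [s for s in servers if s.last_applied >= after_cluster_time]


def select_server(
    servers: Sequence[ServerDescription],
    after_cluster_time: Optional[int],
    rng=random,
) -> ServerDescription:
    """Pick at random among eligible servers instead of among all servers."""
    candidates = eligible_servers(servers, after_cluster_time)
    if not candidates:
        # Assumption: if no server has caught up, fall back to all servers,
        # in which case the read blocks exactly as it does today.
        candidates = list(servers)
    return rng.choice(candidates)


servers = [
    ServerDescription("s1", last_applied=105),
    ServerDescription("s2", last_applied=100),
    ServerDescription("s3", last_applied=98),
]
# After reading RS from s1 at time 105, only s1 can serve R without blocking.
chosen = select_server(servers, after_cluster_time=105)
```

In the example from the description, the session that read RS from S1 would have S2 excluded from selection for R until S2 catches up, so the read is never routed to a server that must block.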

      Applications may need to consider a side effect. Starting a new session with no initial last optime (read concern afterClusterTime) would allow selection from all servers regardless of staleness or lag, but servers with the least replication lag may be selected disproportionately because they meet the after-operation-time criteria of more sessions.
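The skew toward the least-lagged servers can be illustrated by counting, for each server, how many sessions it is eligible to serve. This is a minimal sketch with made-up values: the server optimes, the session times, and the helper `eligibility_counts` are all hypothetical, and optimes are again simplified to integers.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def eligibility_counts(
    server_optimes: Dict[str, int],
    session_times: Sequence[Optional[int]],
) -> Counter:
    """Count, per server, the sessions whose afterClusterTime it already satisfies."""
    counts = Counter({address: 0 for address in server_optimes})
    for session_time in session_times:
        for address, applied in server_optimes.items():
            # None models a new session with no last optime: every server qualifies.
            if session_time is None or applied >= session_time:
                counts[address] += 1
    return counts


# Illustrative values only: s1 has the least lag, s3 the most.
server_optimes = {"s1": 105, "s2": 100, "s3": 98}
session_times = [None, 97, 99, 101, 104]
counts = eligibility_counts(server_optimes, session_times)
```

With these values the least-lagged server is eligible for all five sessions while the most-lagged server is eligible for only two, so under uniform random choice among eligible servers it would receive a larger share of reads.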

            Assignee:
            [DO NOT USE] Backlog - Cluster Scalability
            Reporter:
            Simon Yarde
            Votes:
            0
            Watchers:
            15

              Created:
              Updated: