Core Server / SERVER-32883

Enhanced FSM testing for reading from secondaries

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Backport Requested:
      v4.0, v3.6
    • Sprint:
      TIG 2018-05-07, Storage NYC 2018-05-07, Storage NYC 2018-05-21, Storage NYC 2018-06-04
    • Story Points:
      8

      Description

      1. Change the secondary_reads_passthrough.yml test suite, which was added as part of SERVER-34384, to use the "forceSyncSourceCandidate" failpoint, set as a server parameter, to force secondary #2 to sync from secondary #1.

      2. Add a new version of the concurrency_replication.yml test suite that uses a 5-node replica set in which each secondary syncs from the node before it (i.e. a linear chain), with writeConcern={w: 1}, readConcern={level: "local", afterClusterTime: ...}, and readPreference={mode: "secondary"}. We'll also likely want to make a wrapper around a Mongo connection object to the primary and one to a specific secondary, so that an individual worker thread always talks to the same secondary rather than leaving some secondaries potentially never read from.
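
      The linear chain in item 2 could be expressed as per-node startup options. The following is a minimal sketch, not the suite's actual configuration: it assumes the failpoint is enabled through a "failpoint.forceSyncSourceCandidate" server parameter whose data carries a hostAndPort field, the helper name buildChainedNodeOptions and the host names are hypothetical, and plain JSON.stringify stands in for whatever serialization the shell would use.

```javascript
// Hypothetical helper: given replica set hosts in order (primary first),
// build per-node options so that node i is forced to sync from node i-1.
function buildChainedNodeOptions(hosts) {
    return hosts.map((host, i) => {
        if (i === 0) {
            // The primary has no sync source, so it gets no failpoint.
            return {host: host, setParameter: {}};
        }
        return {
            host: host,
            setParameter: {
                // Assumed payload shape for the failpoint (illustrative only).
                "failpoint.forceSyncSourceCandidate": JSON.stringify({
                    mode: "alwaysOn",
                    data: {hostAndPort: hosts[i - 1]},
                }),
            },
        };
    });
}

// Example with five placeholder hosts: one primary and four chained secondaries.
const chainOptions = buildChainedNodeOptions([
    "localhost:20000",
    "localhost:20001",
    "localhost:20002",
    "localhost:20003",
    "localhost:20004",
]);
```

      With this layout, each entry after the first names its predecessor as the only sync source candidate, which is what produces the chain.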

      I think there's some additional complexity here because we want FSM worker threads to read from different secondaries. (We'll probably pin each thread to a particular secondary, similar to how we "round-robin" when using multiple mongos processes.) It seems like we'll want a Mongo connection object implemented in JavaScript that routes commands present in this list via a direct connection to the secondary and all other commands via a direct connection to the primary. I think the existing "connection cache" in the concurrency framework makes it relatively straightforward to have direct connections to other nodes in the cluster.
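
      The routing idea might look roughly like the sketch below. Everything here is an assumption for illustration: the class name PrimarySecondaryRouter, the helper pickSecondaryForThread, the contents of kSecondaryCommands, and the runCommand(dbName, cmdObj) shape of the connection objects are hypothetical and are not the shell's Mongo API or the concurrency framework's connection cache. Stub connections are used so the sketch runs on its own.

```javascript
// Commands assumed (for illustration only) to be routed to the secondary.
const kSecondaryCommands = new Set(["find", "count", "distinct"]);

// Hypothetical wrapper around two connection-like objects. Each is assumed
// to expose runCommand(dbName, cmdObj).
class PrimarySecondaryRouter {
    constructor(primaryConn, secondaryConn) {
        this.primaryConn = primaryConn;
        this.secondaryConn = secondaryConn;
    }

    runCommand(dbName, cmdObj) {
        // The command name is taken to be the first key of the command object.
        const cmdName = Object.keys(cmdObj)[0];
        const target = kSecondaryCommands.has(cmdName) ? this.secondaryConn
                                                       : this.primaryConn;
        return target.runCommand(dbName, cmdObj);
    }
}

// Pin a worker thread to one secondary, round-robin by thread id.
function pickSecondaryForThread(tid, secondaries) {
    return secondaries[tid % secondaries.length];
}

// Stub connection that reports which node handled the command.
function makeStubConn(name) {
    return {runCommand: (dbName, cmdObj) => ({ok: 1, handledBy: name})};
}

const stubSecondaries = [makeStubConn("secondary0"), makeStubConn("secondary1")];
const router = new PrimarySecondaryRouter(
    makeStubConn("primary"), pickSecondaryForThread(3, stubSecondaries));

const readReply = router.runCommand("test", {find: "coll"});
const writeReply = router.runCommand("test", {insert: "coll", documents: [{x: 1}]});
```

      Choosing the secondary once per thread, rather than per command, keeps each worker's reads on a single node, and with enough threads every secondary ends up assigned to at least one worker.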

      In creating this wrapper around two separate Mongo connection objects, we may also want to change how SERVER-34383 was implemented to construct a wrapper around a secondary's connection from the connection cache instead of creating a replica set connection for the worker thread.

      Original description

      While working on SERVER-32606, it turned out that our testing of tailing the oplog on secondaries, including the case of chained replication, is light, even though the code paths for secondary reads have now become quite different from those for reads on primaries.

      We should have a passthrough test where we test these behaviors. This is related to SERVER-32606, but was too big a task to do as part of that ticket.

              People

              Assignee:
              xiangyu.yao Xiangyu Yao (Inactive)
              Reporter:
              geert.bosch Geert Bosch