Core Server / SERVER-85252

Run Jepsen list-append as a Resmoke workload

Details

    • Type: Task
    • Resolution: Duplicate
    • Priority: Major - P3
    • Replication

    Description

      Jepsen is written in Clojure and runs on its own infrastructure. Few engineers are familiar with either, so even minor changes to workloads (like DEVPROD-2763) or infrastructure (SERVER-79167) take weeks to implement.

      We can get around this problem by rewriting Jepsen workloads in JavaScript so that Resmoke can run them instead. Most Jepsen workloads are simple (but clever), so rewriting them in JavaScript should not take long.

      Roughly, the steps involved are:
      1. Rewrite Jepsen workloads in JavaScript.
      2. Implement infrastructure to feed the Jepsen workload output into Elle, Jepsen's model checker. This could be as simple as prefixing every history line the workload logs and then filtering those lines out of the test log.
      3. Integrate Elle into Resmoke via hooks. For example, a hook could launch the Elle CLI to process the filtered output.
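
Steps 1 and 2 above could be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the prefix string, the function names, and the in-memory store are all hypothetical stand-ins, and a real workload would run database transactions instead. Records are logged as JSON with an invoke entry before each transaction and an ok entry after it, loosely mirroring the append/read micro-ops of a list-append history. The exact input format the Elle CLI expects is not covered here, so a conversion step may be needed.

```javascript
// Hypothetical prefix used to tag history records in the test log.
const HISTORY_PREFIX = "[jepsen-history] ";

// In-memory stand-in for a set of lists keyed by integer. A real workload
// would issue database transactions here instead.
function makeStore() {
    const lists = new Map();
    return {
        append(key, value) {
            if (!lists.has(key)) lists.set(key, []);
            lists.get(key).push(value);
        },
        read(key) {
            return lists.has(key) ? [...lists.get(key)] : [];
        },
    };
}

// Run one transaction made of micro-ops such as ["append", 1, 7] or
// ["r", 1, null]. Logs an invoke record first, then an ok record in which
// each read carries the list that was observed.
function runTxn(store, processId, microOps, log) {
    log(HISTORY_PREFIX +
        JSON.stringify({type: "invoke", f: "txn", process: processId, value: microOps}));
    const completed = microOps.map(([f, key, value]) => {
        if (f === "append") {
            store.append(key, value);
            return [f, key, value];
        }
        return [f, key, store.read(key)];
    });
    log(HISTORY_PREFIX +
        JSON.stringify({type: "ok", f: "txn", process: processId, value: completed}));
}

// Keep only the prefixed records from a mixed log and parse them back.
// The prefix may appear mid-line, since test runners often decorate lines.
function extractHistory(logLines) {
    const records = [];
    for (const line of logLines) {
        const at = line.indexOf(HISTORY_PREFIX);
        if (at === -1) continue;
        records.push(JSON.parse(line.slice(at + HISTORY_PREFIX.length)));
    }
    return records;
}

// Usage: two transactions interleaved with unrelated log output.
const logLines = [];
const log = (line) => logLines.push("[some_test] " + line);
const store = makeStore();
log("unrelated server log line");
runTxn(store, 0, [["append", 1, 1], ["r", 1, null]], log);
runTxn(store, 1, [["append", 1, 2], ["r", 1, null]], log);
const history = extractHistory(logLines);
```

Here `history` holds only the four transaction records, in log order, which is the kind of filtered output a later hook could hand to the checker.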

      A few benefits of doing this:

      • Server engineers can write / debug in an environment they're familiar with
      • We can extend Jepsen workloads through overrides. For example, we can pretty easily come up with a timeseries version of list-append
      • We can run Jepsen workloads under the various fault injection modes / background processes by adding hooks. For example, we can run Jepsen + tenant migrations, Jepsen + initial sync, etc.
      • We can run Jepsen workloads on Antithesis
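
The override idea in the second bullet could look something like the sketch below. Everything here is hypothetical: `listAppend`, `overrideMethod`, and the measurement layout are illustrative names only, and plain arrays stand in for collections. The point is just that if the workload keeps its storage operations as replaceable methods, a separate file can swap in a different layout (here, one record per appended element, loosely in the spirit of a time-series collection) without touching the workload logic.

```javascript
// Hypothetical base workload: each key maps to one document holding a list.
const listAppend = {
    docs: new Map(),
    append(key, value) {
        const doc = this.docs.get(key) || {_id: key, values: []};
        doc.values.push(value);
        this.docs.set(key, doc);
    },
    read(key) {
        const doc = this.docs.get(key);
        return doc ? [...doc.values] : [];
    },
};

// Replace a method on the workload object with another implementation.
function overrideMethod(target, name, replacement) {
    target[name] = replacement;
}

// Override: store every append as its own measurement and rebuild the list
// on read by ordering the measurements for that key by sequence number.
const measurements = [];
let nextSeq = 0;

overrideMethod(listAppend, "append", function (key, value) {
    measurements.push({meta: key, seq: nextSeq++, value});
});

overrideMethod(listAppend, "read", function (key) {
    return measurements
        .filter((m) => m.meta === key)
        .sort((a, b) => a.seq - b.seq)
        .map((m) => m.value);
});

// The workload code calls the same methods as before.
listAppend.append(1, "a");
listAppend.append(2, "b");
listAppend.append(1, "c");
const result = listAppend.read(1);
```

After the override, the original per-key documents are never written, and reads are served entirely from the per-element measurements.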

          People

            backlog-server-repl Backlog - Replication Team
            vishnu.kaushik@mongodb.com Vishnu Kaushik