  1. Core Server
  2. SERVER-85252

Run Jepsen list-append as a Resmoke workload

    • Type: Task
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Replication

      Jepsen is written in Clojure and runs on its own infrastructure. Very few engineers are familiar with either, so even minor changes to workloads (like DEVPROD-2763) or infrastructure (SERVER-79167) take weeks to implement.

      We can get around this problem by rewriting Jepsen workloads in JavaScript so that Resmoke can run them directly. Most Jepsen workloads are simple (but clever), so rewriting them in JavaScript shouldn't be very time-consuming.

      Roughly, the steps involved to make this happen:
      1. Rewrite Jepsen workloads in JavaScript.
      2. Implement infrastructure to feed the Jepsen workload output into Elle, Jepsen's model checker. This could be as simple as prefixing every log line the workload emits and then filtering the logs on that prefix.
      3. Integrate Elle into Resmoke via hooks. For example, a hook could launch the Elle CLI to process the filtered output.
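
      As a rough illustration of steps 1 and 2, here is a minimal sketch of a list-append transaction that logs prefixed history lines, plus a filter that recovers them from mixed log output. Everything in it is an assumption made for illustration: the prefix string, the JSON record shape, and the function names are hypothetical, and an in-memory Map stands in for the database. The records would still need converting to whatever input format the Elle CLI expects.

```javascript
// Hypothetical prefix used to tag workload history lines in the log output.
const HISTORY_PREFIX = "[jepsen-history] ";

// Serialize one history operation as a single prefixed log line.
function formatHistoryLine(op) {
    return HISTORY_PREFIX + JSON.stringify(op);
}

// Apply one list-append transaction to an in-memory Map that stands in for
// the database (an assumption; a real workload would issue server commands).
// Each micro-op is ["append", key, value] or ["r", key, null].
// Returns the completed micro-ops, with read results filled in.
function applyTxn(store, microOps) {
    return microOps.map(([f, key, value]) => {
        if (f === "append") {
            const list = store.get(key) || [];
            store.set(key, list.concat([value]));
            return [f, key, value];
        }
        // Read: copy the list so later appends do not alter this record.
        return [f, key, (store.get(key) || []).slice()];
    });
}

// Run one transaction, logging an invoke record before it and an ok record
// after it. `log` is any function that accepts one line of text.
function runTxn(store, processId, microOps, log) {
    log(formatHistoryLine({type: "invoke", process: processId, value: microOps}));
    const completed = applyTxn(store, microOps);
    log(formatHistoryLine({type: "ok", process: processId, value: completed}));
    return completed;
}

// Recover the history from mixed log output. The prefix may appear anywhere
// in a line, since test runners often prepend their own tags to each line.
function extractHistory(logText) {
    const history = [];
    for (const line of logText.split("\n")) {
        const at = line.indexOf(HISTORY_PREFIX);
        if (at !== -1) {
            history.push(JSON.parse(line.slice(at + HISTORY_PREFIX.length)));
        }
    }
    return history;
}

// Usage: two transactions interleaved with an unrelated log line.
const lines = ["unrelated server log line"];
const store = new Map();
runTxn(store, 0, [["append", "x", 1]], (l) => lines.push("[runner] " + l));
runTxn(store, 1, [["r", "x", null], ["append", "x", 2]], (l) => lines.push(l));
const history = extractHistory(lines.join("\n"));
```

      Keeping one self-describing record per line means the filtering step needs no state, so a hook could run it over the combined test output after the workload finishes.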

      A few benefits of doing this:

      • Server engineers can write and debug workloads in an environment they're familiar with.
      • We can extend Jepsen workloads through overrides. For example, we could fairly easily produce a timeseries version of list-append.
      • We can run Jepsen workloads under the various fault injection modes and background processes by adding hooks. For example, we can run Jepsen + tenant migrations, Jepsen + initial sync, etc.
      • We can run Jepsen workloads on Antithesis.

            Assignee:
            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter:
            vishnu.kaushik@mongodb.com Vishnu Kaushik
            Votes:
            0
            Watchers:
            9
