[SERVER-32837] Run jepsen with 2 nodes and an arbiter on branches and storage engines that don't support recoverable rollback Created: 22/Jan/18 Updated: 06/Dec/22 Resolved: 05/Nov/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Backlog - Server Tooling and Methods (STM) (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | stm, tig-jepsen | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Server Tooling & Methods |
| Participants: |
| Description |
|
This should eliminate the case of two nodes both going into rollback. |
| Comments |
| Comment by Max Hirschhorn [ 04/Jun/18 ] |
|
judah.schvimer, spencer, based on this JQL, there don't appear to have been many additional failures on the 3.4 or 3.6 branches that would require the changes from PM-842. I'm inclined to let this SERVER ticket sit on the backlog, since BF-5658 isn't causing a lot of pain and we aren't sure that using an arbiter provides the same guarantees. |
| Comment by Spencer Brody (Inactive) [ 22/Jan/18 ] |
|
Hmm... I suspect this may cause other issues with jepsen, as PSA (primary-secondary-arbiter) sets don't have the same liveness guarantees that PSS (primary-secondary-secondary) sets do. For instance, if jepsen takes down a secondary, it may then try to do a w:majority write that will time out. With PSS, you know that if you have a primary you can satisfy w:majority writes; with PSA, you don't get that guarantee. I suspect jepsen may rely on that guarantee in order to pass. |
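The liveness difference described above can be illustrated with a small Python sketch. This is a simplified model, not server code: it assumes every member has one vote, that a primary needs a majority of voters to be reachable, and that a w:majority write needs acknowledgement from a majority-sized number of data-bearing members. The names `Member`, `can_have_primary`, `can_ack_majority_write`, and `with_down` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Member:
    name: str
    data_bearing: bool  # False for an arbiter (votes, but holds no data)
    up: bool = True


def majority(members):
    # Simplifying assumption: every member is a voter with one vote.
    return len(members) // 2 + 1


def can_have_primary(members):
    # A primary needs a majority of voters reachable, and must itself hold data.
    up_voters = sum(m.up for m in members)
    has_data_node = any(m.up and m.data_bearing for m in members)
    return has_data_node and up_voters >= majority(members)


def can_ack_majority_write(members):
    # Only data-bearing members can acknowledge a write; arbiters cannot.
    up_data = sum(m.up and m.data_bearing for m in members)
    return up_data >= majority(members)


def with_down(members, name):
    # Return a copy of the set with the named member marked unreachable.
    return [replace(m, up=False) if m.name == name else m for m in members]


pss = [Member("P", True), Member("S1", True), Member("S2", True)]
psa = [Member("P", True), Member("S1", True), Member("A", False)]

# Take down one secondary in each topology, as a jepsen nemesis might.
pss_after = with_down(pss, "S1")
psa_after = with_down(psa, "S1")

for label, rs in (("PSS", pss_after), ("PSA", psa_after)):
    print(label, "primary:", can_have_primary(rs),
          "w:majority:", can_ack_majority_write(rs))
```

Under these assumptions, both topologies keep a primary after losing one secondary, but only PSS can still acknowledge a w:majority write. In the PSA case the arbiter's vote keeps the primary elected while contributing nothing toward the write concern, which is the situation in which the write would time out.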