We do not kill user or internal reads on step down, just as we do not kill internal writes on step down. Thus, if they hit a prepare conflict, they will cause the deadlock described in
- Stepdown and step up will now require RSTL in mode S.
- Rollback requires RSTL in mode X.
- Reads take RSTL in mode IS.
- Writes take RSTL in mode IX.
- Operations would need to commit to whether they can write when they acquire a lock, but this is acceptable (and is essentially already the contract).
- We don’t plan to implement upgrading locks. If something wants to "upgrade" its locks, it must drop all of its locks and reacquire them and make sure that's safe for its own purposes.
- Implement this by taking the same lock mode for the RSTL that you take for the global lock.
- The advantage is that stepdown doesn’t need to wait for reads to complete or yield, and yielded readers don’t need to wait for stepdown to complete.
- This fixes the problem for any operation that acquires a global IS lock (user or internal), since stepdown will no longer block waiting for the operation to complete.
- User operations that acquire global S, IX, or X locks are already killed on stepdown so aren't a problem.
- Internal operations that acquire global S, IX, or X locks on user data still must be explicitly killed, so the RangeDeleter, TTL, and any other internal writers to user data must still be audited and fixed.
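
The mode assignments above can be checked against the standard multi-granularity lock compatibility matrix. The following is a minimal sketch, not the LockManager implementation: the `RSTL_MODE` table and the `conflicts` helper are hypothetical names introduced only for illustration, and the matrix is the textbook one for IS/IX/S/X.

```python
from enum import Enum


class Mode(Enum):
    IS = "IS"
    IX = "IX"
    S = "S"
    X = "X"


# Textbook multi-granularity compatibility matrix (symmetric).
COMPATIBLE = {
    Mode.IS: {Mode.IS, Mode.IX, Mode.S},
    Mode.IX: {Mode.IS, Mode.IX},
    Mode.S: {Mode.IS, Mode.S},
    Mode.X: set(),
}

# RSTL mode taken by each kind of operation, as listed above.
# The operation names are hypothetical labels for this sketch.
RSTL_MODE = {
    "read": Mode.IS,
    "write": Mode.IX,
    "stepdown": Mode.S,
    "stepup": Mode.S,
    "rollback": Mode.X,
}


def conflicts(op_a: str, op_b: str) -> bool:
    """Return True if the two operations cannot hold the RSTL at the same time."""
    return RSTL_MODE[op_b] not in COMPATIBLE[RSTL_MODE[op_a]]


if __name__ == "__main__":
    for a in RSTL_MODE:
        row = ", ".join(b for b in RSTL_MODE if conflicts(a, b))
        print(f"{a:9s} conflicts with: {row or '(nothing)'}")
```

Under this matrix, stepdown (S) does not conflict with reads (IS) but does conflict with writes (IX), and rollback (X) conflicts with everything. Note that S is also compatible with S, so the RSTL alone would not serialize a step up against a step down.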
Because step up and step down acquire the RSTL in mode S, and S is compatible with S, concurrent state transitions could start happening. To protect against this we will add a new LockManager ResourceMutex that the ReplicationStateTransitionLockGuard acquires after acquiring the RSTL and releases before releasing the RSTL. This should be a straightforward way to let reads proceed alongside step-ups and step-downs (IS is compatible with S) while step-ups and step-downs still conflict with each other, even though both take S locks, via the ResourceMutex, which does not interact with reads at all.
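
The acquisition order described here can be illustrated with a toy model. This is a sketch under stated assumptions, not the real guard: `ToyRSTL` and `StateTransitionGuard` are hypothetical classes, a plain `threading.Lock` stands in for the ResourceMutex, and all acquisitions are non-blocking try-acquires so that the conflict is visible without spawning threads.

```python
import threading

# Textbook compatibility sets for the modes used on the RSTL.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}


class ToyRSTL:
    """Toy stand-in for the RSTL: grants a mode only if it is
    compatible with every mode currently held."""

    def __init__(self):
        self._held = []
        self._guard = threading.Lock()

    def try_acquire(self, mode: str) -> bool:
        with self._guard:
            if all(mode in COMPATIBLE[h] for h in self._held):
                self._held.append(mode)
                return True
            return False

    def release(self, mode: str) -> None:
        with self._guard:
            self._held.remove(mode)


class StateTransitionGuard:
    """Hypothetical guard: RSTL in mode S first, then the transition mutex;
    released in the reverse order."""

    def __init__(self, rstl: ToyRSTL, transition_mutex: threading.Lock):
        self._rstl = rstl
        self._mutex = transition_mutex

    def try_enter(self) -> bool:
        if not self._rstl.try_acquire("S"):
            return False
        if not self._mutex.acquire(blocking=False):
            # Another state transition is in progress; back out of the RSTL.
            self._rstl.release("S")
            return False
        return True

    def exit(self) -> None:
        self._mutex.release()    # mutex is released first...
        self._rstl.release("S")  # ...then the RSTL


if __name__ == "__main__":
    rstl, mutex = ToyRSTL(), threading.Lock()
    stepdown = StateTransitionGuard(rstl, mutex)
    stepup = StateTransitionGuard(rstl, mutex)
    print("stepdown enters:", stepdown.try_enter())
    print("reader gets IS:", rstl.try_acquire("IS"))
    print("stepup enters concurrently:", stepup.try_enter())
    stepdown.exit()
```

In this model a reader can take IS while a stepdown holds S, but a second state transition fails at the mutex even though its S acquisition on the RSTL succeeds, which is the separation of concerns the paragraph describes.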