TLA+ spec for kNotIdempotent retry criteria + minimum-safe NotPrimary subset (SERVER-108482)

    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Summary

      TLA+ spec for SERVER-108482 (kNotIdempotent retry criteria too permissive; prior fix auto-reverted). Identifies the minimum-safe NotPrimary subset that disambiguates rejected-before-apply from after-apply failure modes.

      Files

      • src/mongo/tla_plus/Networking/KNotIdempotentRetry/KNotIdempotentRetry.tla (289 lines) — classifier's three axes: RequestState × NotPrimarySub × IdemClass. Bug toggle AllowRetryOnAmbiguousNotPrimary.
      • MCKNotIdempotentRetry.tla / .cfg (green, =FALSE)
      • MCKNotIdempotentRetry-bug.cfg (bug, =TRUE)
      • OWNERS.yml → 10gen/server-catalog-and-routing
      • Sibling Networking/OWNERS.yml so the new dir has parent ownership.
      • README.md (316 words)

      TLC verdicts

      • Green: 13724 states generated, 4971 distinct, depth 15, no error.
      • Bug: Invariant NoDoubleApplication is violated; 5-step counterexample NonIdempotentNonWrite → AppliesThenFails(PrimarySteppedDown, applied=1) → DispatcherRetries → AppliesThenFails (applied=2).

      Minimum-safe NotPrimary subset (load-bearing finding)

      Retry-safe subset: {NotWritablePrimary, NotPrimaryNoSecondaryOk} — rejected-before-apply codes.

      Excluded: {PrimarySteppedDown, InterruptedDueToReplStateChange} — these can fire after the apply path is entered and need NoWritesPerformed (SERVER-66479) to disambiguate.

      This is why both #52940 (drop everything) and the pre-#52940 status quo (retry everything) are wrong; safe behavior is rejected-before-apply pair only.

            Assignee:
            Unassigned
            Reporter:
            Mehar Grewal
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: