(Follower mode) harden fast truncate truncate‑list implementation

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Truncate
    • None
    • Storage Engines - Foundations
    • SE Foundations - 2026-04-10
    • 5

      Problem
      Currently, slow truncate is enabled on both primaries and standbys, but it cannot keep up with large truncate workloads. This can cause significant lag on standby nodes relative to the primary, to the point where standbys may be unable to serve reads effectively, introducing an availability risk. The existing ASC fast truncate implementation cannot be used on standbys because it writes fast‑truncate metadata directly to the stable table, violating the core invariant that standby mode must not perform updates/inserts on the stable table.

      To address this, we are adopting a new standby fast truncate design that uses an in‑memory truncate list to track truncate ranges on standbys. After WT‑15207, a simple POC truncate list exists on develop and provides a basic working fast truncate path for standby nodes, but this implementation is still incomplete: the data structure and APIs are not yet final, there are multiple FIXMEs, and the truncate‑list create/destroy routines do not have dedicated unit tests.

      As a result, we do not yet know whether the current truncate list:

      • Reliably captures and maintains truncate ranges for standby fast truncate under real workloads
      • Preserves standby invariants and avoids correctness bugs or leaks in its lifecycle
      • Provides a stable foundation that future fast truncate work on standby nodes can safely build on

      This ticket closes that gap by validating and hardening the standby fast truncate truncate‑list implementation. The work covers reviewing and finalising the POC truncate‑list structure and behaviour, resolving the existing FIXMEs, and adding unit tests for truncate‑list creation/destruction (other core operations will be covered in other tickets). The goal is to catch regressions early and give the standby fast truncate design a solid, tested data structure to rely on.
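
      As an illustration of the kind of create/destroy lifecycle coverage described above, here is a minimal sketch of an in-memory truncate list in C. It is not the implementation on develop: every name (trunc_list, trunc_list_create, trunc_list_add, trunc_list_contains, trunc_list_destroy, trunc_list_test_lifecycle) is hypothetical, and keys are assumed to be simple 64-bit integers with inclusive ranges kept in an unsorted linked list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of these are existing WiredTiger identifiers. */

/* One truncated key range [start, stop], inclusive on both ends (an assumption). */
typedef struct trunc_range {
    uint64_t start;
    uint64_t stop;
    struct trunc_range *next;
} trunc_range;

/* In-memory list of truncated ranges; nothing is written to any table. */
typedef struct {
    trunc_range *head;
    size_t count;
} trunc_list;

/* Allocate an empty, zeroed list; returns NULL on allocation failure. */
trunc_list *
trunc_list_create(void)
{
    return (calloc(1, sizeof(trunc_list)));
}

/* Record a truncated range; returns 0 on success, -1 on bad input or allocation failure. */
int
trunc_list_add(trunc_list *list, uint64_t start, uint64_t stop)
{
    trunc_range *r;

    if (list == NULL || start > stop)
        return (-1);
    if ((r = malloc(sizeof(*r))) == NULL)
        return (-1);
    r->start = start;
    r->stop = stop;
    r->next = list->head; /* Push at the head; ordering is not maintained. */
    list->head = r;
    ++list->count;
    return (0);
}

/* Return 1 if key falls inside any recorded range, otherwise 0. */
int
trunc_list_contains(const trunc_list *list, uint64_t key)
{
    const trunc_range *r;

    if (list == NULL)
        return (0);
    for (r = list->head; r != NULL; r = r->next)
        if (key >= r->start && key <= r->stop)
            return (1);
    return (0);
}

/* Free every range and the list itself, then clear the caller's pointer. */
void
trunc_list_destroy(trunc_list **listp)
{
    trunc_range *r, *next;

    if (listp == NULL || *listp == NULL)
        return;
    for (r = (*listp)->head; r != NULL; r = next) {
        next = r->next;
        free(r);
    }
    free(*listp);
    *listp = NULL;
}

/* A lifecycle check in the spirit of the ticket: create, populate, destroy. */
void
trunc_list_test_lifecycle(void)
{
    trunc_list *list = trunc_list_create();

    assert(list != NULL && list->count == 0);
    assert(trunc_list_add(list, 100, 200) == 0);
    assert(trunc_list_contains(list, 150));
    trunc_list_destroy(&list);
    assert(list == NULL);
    trunc_list_destroy(&list); /* A second destroy must be a harmless no-op. */
}
```

      Having destroy clear the caller's pointer makes a double destroy a no-op. Running a check like this under a leak detector would also exercise the lifecycle concerns listed above.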

            Assignee:
            Jie Chen
            Reporter:
            Jie Chen
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: