Optimise direct ingest creation to mitigate core startup bottlenecks

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Schema Management
    • Storage Engines - Foundations
    • 17.535
    • Storage Execution 2026-06-22

      Description:
      During the reconfigure(checkpoint_meta=...) step of stateless follower node startup, the time to pick up the checkpoint scales poorly with the number of tables. At 250,000 tables, startup takes roughly 27 minutes, with 99% of that time spent in __layered_create_missing_ingest_table.

      Creating each ingest table incurs three major costs:

      • Dhandle Cache Insertion (__wt_session_get_dhandle)
      • Metadata Checkpoints (__wt_meta_track_off)
      • Config Management parsing overhead

      Objective:
      Investigate and mitigate the core bottlenecks (specifically dhandle cache insertion and local metadata checkpoints) during ingest creation. The goal is to determine whether we can meet the Vertical Scaling SLOs (e.g., a P95 WT startup time of 90 seconds for 110,370 tables) without larger architectural changes.
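
      As a rough sizing aid, the figures quoted in this ticket can be turned into a per-table cost and a required speedup. The sketch below is back-of-the-envelope arithmetic only. It assumes that ingest creation cost scales linearly with table count (the ticket says only that it "scales poorly", so the real curve may be worse) and that the whole 90-second budget is available to ingest creation. The function and constant names are illustrative, not part of any codebase.

```python
# Back-of-the-envelope sizing from the figures quoted in this ticket.
# ASSUMPTION: per-table cost is constant (linear scaling); the real
# behaviour may be superlinear, in which case this is a lower bound.

OBSERVED_TABLES = 250_000          # tables in the slow startup measurement
OBSERVED_STARTUP_S = 27 * 60       # roughly 27 minutes, in seconds
INGEST_CREATE_SHARE = 0.99         # share spent in ingest table creation

SLO_TABLES = 110_370               # table count named in the SLO example
SLO_STARTUP_S = 90.0               # P95 startup target, in seconds


def per_table_cost_s() -> float:
    """Average seconds spent creating one ingest table today."""
    return OBSERVED_STARTUP_S * INGEST_CREATE_SHARE / OBSERVED_TABLES


def projected_ingest_time_s(tables: int) -> float:
    """Projected ingest creation time for `tables`, assuming linear scaling."""
    return tables * per_table_cost_s()


def required_speedup() -> float:
    """Factor by which per-table cost must drop to fit the SLO budget."""
    return projected_ingest_time_s(SLO_TABLES) / SLO_STARTUP_S


if __name__ == "__main__":
    print(f"per-table cost today: {per_table_cost_s() * 1000:.2f} ms")
    print(f"per-table budget:     {SLO_STARTUP_S / SLO_TABLES * 1000:.2f} ms")
    print(f"projected time at SLO table count: "
          f"{projected_ingest_time_s(SLO_TABLES):.0f} s")
    print(f"required speedup: {required_speedup():.1f}x")
```

      Under these assumptions the current cost is about 6.4 ms per table against a budget of about 0.8 ms, so the mitigations would need to deliver close to an 8x reduction in per-table cost.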

      Definition of Done:

      • Core bottlenecks in __layered_create_missing_ingest_table are mitigated.
      • Benchmarks have been successfully run against the Vertical Scaling SLOs.
      • A definitive Go/No-Go decision is made regarding this approach for Sprint 2.
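
      To make the benchmark criterion concrete, a minimal sketch of a P95 check against the 90-second target follows. It assumes startup times are collected as a list of seconds from repeated runs and uses the nearest-rank percentile definition; the SLO may define P95 differently. The helper names are hypothetical.

```python
# Minimal P95 check for benchmark results.
# ASSUMPTION: nearest-rank percentile; samples are startup times in seconds.

def p95(samples):
    """Return the nearest-rank 95th percentile of `samples`."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank = ceil(0.95 * n), computed with integers to avoid
    # floating-point rounding at exact boundaries.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_slo(samples, target_seconds=90.0):
    """True if the P95 startup time is within the target."""
    return p95(samples) <= target_seconds
```

      With 20 runs, the nearest-rank P95 is the 19th-fastest run, so a single slow outlier does not fail the check but two do.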

            Assignee:
            Wei Hu
            Reporter:
            Sid Mahajan