Improve IO parallelism of ingest drain

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Layered Tables
    • Storage Engines - Foundations
    • 202.408

      It is important for failover performance and availability that ingest table drain is fast.

      Currently we parallelize this work by having different threads drain each ingest table. But on a system with a small number of large tables, this doesn't help much, as performance is reduced to the time it takes a single thread to walk and drain the largest ingest table. If the required stable table pages aren't in cache, this bottlenecks on serial IO requests.

      The goal of this ticket (which maybe should be a small/medium project) is to provide further parallelization so that ingest drain can read the required stable table pages into cache at close to network bandwidth.

      WT-17325 demonstrated that this level of performance improvement is possible. But its approach – using prefetch to load entire tables into cache – isn't viable in the general case where the ingest content may only require a fraction of the stable table.

      A few possible approaches come to mind:

      • Subdivide the keyspace of a table and have separate threads drain each key range. Challenges here include:
        • Uneven distribution of keys
        • When we start doing ingest drain asynchronously, we'd have to be careful to make sure that new keys don't get inserted in between ranges in a way that results in them not getting copied to the stable table.
      • Prefetch the pages. Have a thread walk through the ingest table and prefetch each of the corresponding stable table pages. This could leverage the existing prefetch infrastructure of worker threads that take prefetch requests off a queue, but it would populate that queue differently. Since there is no way to know in advance which ingest keys are on the same stable table page, we'd probably have to issue a prefetch request for each key and have logic to return early if the target page is already queued (or being read).
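
      To make the second approach concrete, here is a minimal Python sketch of per-key prefetch requests that are deduplicated per stable table page. It is a toy model, not WiredTiger code: the `DedupPrefetcher` class, the `page_start_keys` list (standing in for a btree search that finds the page covering a key), and the `read_page` callback are all hypothetical names introduced for illustration.

```python
import bisect
import queue
import threading


class DedupPrefetcher:
    """Toy model: per-key prefetch requests, deduplicated per stable page."""

    def __init__(self, page_start_keys, read_page, num_workers=4):
        # Assumption: a sorted list of each page's first key stands in for
        # the tree search that maps a key to its stable table page.
        self._starts = page_start_keys
        self._read_page = read_page   # hypothetical blocking page read
        self._lock = threading.Lock()
        self._pending = set()         # pages queued or currently being read
        self.cache = {}               # page id -> page contents
        self._queue = queue.Queue()
        self._workers = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_workers)
        ]
        for worker in self._workers:
            worker.start()

    def _page_for(self, key):
        # Index of the last page whose first key is <= key.
        return max(bisect.bisect_right(self._starts, key) - 1, 0)

    def request(self, key):
        """Queue a read for the page covering key; False if redundant."""
        page = self._page_for(key)
        with self._lock:
            if page in self.cache or page in self._pending:
                return False          # already cached, queued, or being read
            self._pending.add(page)
        self._queue.put(page)
        return True

    def _worker(self):
        while True:
            page = self._queue.get()
            try:
                if page is None:      # shutdown sentinel
                    return
                data = self._read_page(page)
                with self._lock:
                    # Publish to cache and clear pending under one lock so
                    # request() never sees the page in neither state.
                    self.cache[page] = data
                    self._pending.discard(page)
            finally:
                self._queue.task_done()

    def close(self):
        """Wait for outstanding reads, then stop the workers."""
        self._queue.join()
        for _ in self._workers:
            self._queue.put(None)
        for worker in self._workers:
            worker.join()
```

      In this sketch the walking thread would call `request(key)` for every ingest key. Only the first key that lands on a given page triggers a read, so the number of IOs is bounded by the number of distinct stable pages touched rather than the number of ingest keys, and the reads proceed in parallel across the workers.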

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Keith Smith