WiredTiger / WT-7788

Ideas on improving checkpoint cleanup

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Backlog
    • Component/s: None
    • Labels:
      None

      Description

      This ticket came out of an issue we saw with logkeeper, where checkpoint cleanup was causing long checkpoints and stalls. Alex Gorrod, Haribabu Kommi and I identified the following areas for improvement:

      • Limit the amount of I/O generated by history cleanup during a checkpoint. This could be done by spreading the work across checkpoints instead of walking the whole tree at each checkpoint. Potentially we could save the location of the walk and resume from it at the next checkpoint.
      • The internal pages the checkpoint reads for cleanup are also queued for urgent eviction. Since a checkpoint can load all the internal pages in a tree, the intention is to avoid thrashing the cache with pages that are not part of the working set. On the other hand, these pages need to be read again for the next checkpoint. We can evaluate whether it would make sense to keep these pages in cache instead of evicting them.
      • We could also add heuristics to reduce the amount of work needed to find obsolete content. Instead of revisiting each internal page at every checkpoint, these heuristics would guide us to where to look.
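
      To make the first and third ideas concrete, here is a minimal sketch of a cleanup walk that visits a bounded number of pages per checkpoint, saves its position, and skips pages a hint marks as clean. It is not WiredTiger code: Page, CleanupWalker, budget, resume_at and may_have_obsolete are all hypothetical names, the tree is flattened to a list for brevity, and the hint is assumed to be readable without loading the page (for example, if it were kept alongside the parent's reference to the page).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical stand-in for an internal page in a tree."""
    page_id: int
    # Hypothetical hint: True if the page might still hold obsolete content.
    may_have_obsolete: bool = True


class CleanupWalker:
    """Cleans at most `budget` pages per checkpoint and remembers where it stopped."""

    def __init__(self, pages: List[Page], budget: int) -> None:
        self.pages = pages
        self.budget = budget
        self.resume_at = 0  # saved walk position, carried across checkpoints

    def run_checkpoint(self) -> List[int]:
        """Return the ids of the pages cleaned during this checkpoint."""
        cleaned: List[int] = []
        reads = 0      # pages actually read and cleaned (the I/O we want to bound)
        scanned = 0    # pages considered, so one checkpoint never loops past a full pass
        pos = self.resume_at
        while scanned < len(self.pages) and reads < self.budget:
            page = self.pages[pos]
            if page.may_have_obsolete:
                # Assumed cost model: only pages that need cleanup cost a read.
                reads += 1
                page.may_have_obsolete = False
                cleaned.append(page.page_id)
            pos = (pos + 1) % len(self.pages)
            scanned += 1
        self.resume_at = pos
        return cleaned
```

      With five pages, a budget of two and page 1 already marked clean, the first checkpoint would clean pages 0 and 2 and the second would resume at page 3. How such a hint would be maintained, and how a saved position would stay valid across splits and merges, is left open here.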

      Alex Gorrod and Haribabu Kommi, please feel free to add to or edit this list.

              People

              Assignee:
              backlog-server-storage-engines Backlog - Storage Engines Team
              Reporter:
              sulabh.mahajan Sulabh Mahajan
              Votes:
              0
              Watchers:
              8