  1. WiredTiger
  2. WT-4111

Improve checkpoint scrubbing algorithm

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.6.9, 4.0.1, 4.1.1, WT3.2.0
    • Component/s: None
    • Labels:
    • Case:
    • Story Points:
      3
    • Sprint:
      Storage Non-NYC 2018-07-02
    • Backport Requested:
      v4.0, v3.6

      Description

      The intent of the scrub phase of checkpoints is to reduce how much content a checkpoint needs to flush to disk. Scrubbing has to deal with many different workloads, but two extremes define the bounds the algorithm needs to work within.

      1) Large caches with a low volume of writes. With this type of workload, pre-flushing dirty content makes checkpoints much less disruptive to other operations. It's the primary target for scrubbing.

      2) High-throughput write workloads. In this case there is little hope of reducing the amount of content a checkpoint needs to write, since the workload re-dirties data much faster than it can be flushed to disk. In this situation scrubbing isn't useful: we've seen applications spend significant time scrubbing (and suffer reduced workload throughput as a consequence) without the scrubbing materially reducing the amount of work the checkpoint does. We have a "give up on scrubbing" heuristic to handle this case, and it's effective in extreme cases, but getting the heuristic correct has been difficult.

      Given the above, it is worth changing how scrubbing works. At the moment it steps the eviction trigger (the upper threshold) down to a lower value, which co-opts application threads into eviction work.

      We could instead look for workloads similar to 1) by reducing the eviction target (the lower bound) and measuring whether scrubbing makes progress in reducing cache usage.

      We expect such a change to remain effective for workloads like 1) while significantly reducing the cost of scrubbing for workloads like 2).
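
      The "measure whether scrubbing makes progress" idea can be sketched with a toy model. This is not WiredTiger code: the function, its parameters (`write_rate`, `flush_rate`, `target_bytes`, `min_progress`) and the per-round progress test are all assumptions made for illustration, since the ticket does not say how progress would be measured. The sketch scrubs toward a reduced lower-bound threshold and gives up as soon as a round fails to shrink the dirty content by a minimum fraction.

```python
from dataclasses import dataclass


@dataclass
class ScrubResult:
    rounds: int       # scrub rounds actually run
    gave_up: bool     # True if scrubbing stopped for lack of progress
    dirty_bytes: int  # dirty content left for the checkpoint to write


def scrub(dirty_bytes, write_rate, flush_rate, target_bytes,
          max_rounds=20, min_progress=0.05):
    """Toy simulation of progress-based scrubbing (all names hypothetical).

    Each round the workload dirties `write_rate` bytes and eviction cleans
    up to `flush_rate` bytes. Scrubbing succeeds once dirty content is at
    or below `target_bytes`, and gives up when a round shrinks dirty
    content by less than `min_progress` of its level at the round's start.
    """
    for rounds in range(1, max_rounds + 1):
        before = dirty_bytes
        dirty_bytes = max(0, dirty_bytes + write_rate - flush_rate)
        if dirty_bytes <= target_bytes:
            return ScrubResult(rounds, False, dirty_bytes)
        if before - dirty_bytes < min_progress * before:
            return ScrubResult(rounds, True, dirty_bytes)
    return ScrubResult(max_rounds, False, dirty_bytes)


# Workload 1): writes are slow relative to flushing, so scrubbing converges.
light = scrub(dirty_bytes=1000, write_rate=10, flush_rate=110, target_bytes=500)
# Workload 2): data is re-dirtied faster than it is flushed, so scrubbing
# gives up after a single round instead of burning eviction time.
heavy = scrub(dirty_bytes=1000, write_rate=200, flush_rate=110, target_bytes=500)
print(light)
print(heavy)
```

      In this model the cost of scrubbing a write-heavy workload is bounded by one unproductive round, while the low-write workload still reaches the reduced threshold before the checkpoint starts.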
       

            People

            Assignee:
            michael.cahill Michael Cahill
            Reporter:
            alexander.gorrod Alexander Gorrod
            Votes:
            0
            Watchers:
            11
