WiredTiger / WT-9812

Prototype a chunk cache

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: WT11.2.0, 7.1.0-rc0
    • Affects Version/s: None
    • Component/s: None

      Summary
      This project will prototype a chunk cache for remote objects.

      Motivation
      When data lives in slow remote storage, such as S3, local caching has obvious advantages. WiredTiger has two ways of caching data: (1) a block cache, implemented in the WiredTiger core, and (2) a tier cache, implemented in dir_store. Neither implementation is ideal.

      The block cache caches WiredTiger storage blocks, which are typically 4K-24K in size, depending on the configured allocation unit. The current architecture does not allow the block cache to cache data at granularities other than the block size. Our experiments showed that reading data from remote tiers at block granularity performs substantially worse than reading it in larger chunks, e.g., the entire object.

      The tier cache, implemented in dir_store, assumes that we have unlimited local space for caching tiers, which is not a realistic assumption. Furthermore, caching entire tiered objects means we cache the useless data in them along with the useful data. The latency of servicing a cache miss goes up if we have to read a potentially huge amount of data from S3 before we can retrieve the block we need. The size of tiered objects can vary a lot. If we wind up with a single object that is, say, 1 TB, reading it into the existing dir_store cache could displace everything else we have cached. If our local disk is smaller than a single object, we wind up with an object that we cannot cache, and thus cannot access.

      How likely is it that this use case or problem will occur?
      This will most certainly become an issue once tiered storage is deployed.

      If the problem does occur, what are the consequences and how severe are they?

      Deploying tiered storage without a properly functioning cache will mean (at best) that access latency will be high or unreliable and (at worst) that we will fail to read a remote object entirely (as explained above).

      Suggested Solution
      We will implement a cache tier that caches data in chunks of configurable size. The block cache can be used in addition to the chunk cache, if needed.

      This document has a bit more content and discussion: https://docs.google.com/document/d/1n6WXGUuelp-q02dW1mpz1sZ9QMoma_j2G3PgOoTHSGg/edit?usp=sharing

            Assignee:
            sasha.fedorova@mongodb.com Sasha Fedorova
            Reporter:
            sasha.fedorova@mongodb.com Sasha Fedorova
            Votes:
            0
            Watchers:
            4
