Build a file-backed block cache


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Block Cache
    • Storage Engines, Storage Engines - Persistence
    • SE Persistence backlog
    • None

      The block cache currently has two modes of operation: DRAM and NVRAM. We should add a third mode: a cache backed by a temporary file. We need this so that disaggregated storage can use a larger-than-memory cache. A DRAM-based block cache might not be sufficient, because we would still need to leave a large amount of DRAM unused so that another memory allocation by mongod does not result in an OOM condition.

      A file-based block cache would most likely look like our current NVRAM-based solution, using a library similar to libmemkind to create a memory pool and then allocating memory from it. To initialize the memory pool, we could do the following at least on Unix-based systems (we probably wouldn't need to support this mode of block cache on other platforms):

      1. Create a temporary file and extend it to the appropriate size by writing to an offset at the end of the range, or possibly via fallocate or a similar call.
      2. Memory-map the temporary file.
      3. Possibly unlink the file, so that the OS reclaims it automatically once the mapping is released or the process exits.
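
      The three steps above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the WiredTiger implementation: the names `file_cache_region`, `file_cache_region_open`, `file_cache_region_close` and the `blkcache-XXXXXX` file name template are hypothetical, and `ftruncate` stands in for "writing to an offset at the end of the range" or `fallocate`. Note that `ftruncate` produces a sparse file, so a full local disk would surface later as a fault on page access rather than as an error here; `posix_fallocate` could be swapped in to reserve the space up front.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for a file-backed cache region; not a WiredTiger type. */
typedef struct {
    void *base;  /* start of the mapping */
    size_t size; /* length of the mapping in bytes */
} file_cache_region;

/*
 * Create a temporary file of `size` bytes under `dir`, map it, and unlink it.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static int
file_cache_region_open(const char *dir, size_t size, file_cache_region *out)
{
    char path[4096];
    void *base;
    int fd, n, saved;

    n = snprintf(path, sizeof(path), "%s/blkcache-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Step 1: create the temporary file and extend it to the target size. */
    if ((fd = mkstemp(path)) == -1)
        return -1;
    if (ftruncate(fd, (off_t)size) != 0) {
        saved = errno;
        (void)close(fd);
        (void)unlink(path);
        errno = saved;
        return -1;
    }

    /* Step 2: map the file shared, so the file (not swap) backs the pages. */
    base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    saved = errno;

    /*
     * Step 3: unlink the file and close the descriptor. The mapping keeps the
     * file alive; its storage is reclaimed once the mapping goes away.
     */
    (void)unlink(path);
    (void)close(fd);

    if (base == MAP_FAILED) {
        errno = saved;
        return -1;
    }
    out->base = base;
    out->size = size;
    return 0;
}

/* Release the mapping; with the file already unlinked, nothing is left behind. */
static void
file_cache_region_close(file_cache_region *r)
{
    if (r->base != NULL)
        (void)munmap(r->base, r->size);
    r->base = NULL;
    r->size = 0;
}
```

      A caller would invoke `file_cache_region_open(dir, size, &region)` once at startup and hand `region.base` and `region.size` to whatever pool allocator manages the blocks. The directory should be on local storage rather than tmpfs, since tmpfs pages would count against memory again.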

      If possible, we should find a way to instruct the OS not to write those pages back to local storage more than necessary. We should also find a way to keep the OS from writing any pages back to disk at shutdown, as that would be completely unnecessary. At a quick glance the madvise MADV_FREE flag seems promising, but we still need to confirm that it does what we need here: the Linux madvise(2) man page documents MADV_FREE as applying only to private anonymous pages, so it may not be usable on a shared file mapping.
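
      One way to check the madvise question on a given platform is a small probe like the one below. It is only a sketch: the function name `probe_madv_free_on_file_mapping` and the file name template are hypothetical, and the probe reports only whether the kernel accepts the call, not whether write-back is actually avoided. On Linux the call is expected to be rejected with EINVAL, in line with the man page restriction to private anonymous pages; other systems may behave differently.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical probe: map an unlinked temporary file under `dir` with
 * MAP_SHARED, dirty one page, and try madvise(MADV_FREE) on the mapping.
 *
 * Returns 0 if the kernel accepted the advice, a positive errno value if it
 * rejected it (or ENOTSUP if MADV_FREE is not defined on this platform), and
 * -1 if the probe could not set up the mapping at all.
 */
static int
probe_madv_free_on_file_mapping(const char *dir, size_t size)
{
#ifndef MADV_FREE
    (void)dir;
    (void)size;
    return ENOTSUP;
#else
    char path[4096];
    void *base;
    int fd, n, rc;

    n = snprintf(path, sizeof(path), "%s/madvprobe-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    if ((fd = mkstemp(path)) == -1)
        return -1;
    (void)unlink(path); /* the open descriptor keeps the file alive */

    if (ftruncate(fd, (off_t)size) != 0) {
        (void)close(fd);
        return -1;
    }
    base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    (void)close(fd);
    if (base == MAP_FAILED)
        return -1;

    /* Dirty the first page so there is something the OS could write back. */
    ((volatile unsigned char *)base)[0] = 1;

    rc = madvise(base, size, MADV_FREE) == 0 ? 0 : errno;
    (void)munmap(base, size);
    return rc;
#endif
}
```

      If the probe returns a non-zero value on the target platform, MADV_FREE is not an option for this mapping type there, and other approaches would need to be evaluated.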

      At the same time, either in this ticket or in a separate ticket completed before this one, we should check what having a persistent cache means from a security perspective. If Atlas allows memory swapping, maybe this would be no different. Or maybe we would be allowed to write pages to the cache only if they are encrypted, in which case we need to decide what to do when encryption is done outside of WT.

              Assignee:
              [DO NOT USE] Backlog - Storage Engines Team
              Reporter:
              Etienne Petrel
              Votes:
              0
              Watchers:
              2
