WiredTiger / WT-10227

Unnecessary deleted page instantiations

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • StorEng - Refinement Pipeline

      Summary
      One common reason a deleted page gets instantiated (that is, a page is fast-truncated, and we later read it in and populate it with tombstones) is that cursor search unconditionally reads the page just to return WT_NOTFOUND. This is unavoidable when the cursor is being positioned for a write. For a read, however, it seems we ought to be able to detect that the page is deleted and return WT_NOTFOUND directly, without reading it in. That might apply only in cases where the visibility check is straightforward; if things get complicated, it's fine to fall back to instantiation and the ordinary read code.

      (Note that cursor->next and cursor->prev skip over deleted pages, but an explicit search does not.)

      This is not entirely trivial and might turn out to be harder than I think, or even impossible, but it could save a fair amount of work and is therefore worth considering.
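
      To make the proposal concrete, here is a minimal toy model of the decision in Python. It is not WiredTiger code: ToyPage, ToyStats, search, and the single-timestamp visibility rule are all hypothetical names and simplifications invented for illustration. It only shows the intended shape: a read-only search takes the shortcut when the truncate is plainly visible, and everything else falls back to instantiation and the ordinary read path.

```python
from dataclasses import dataclass
from typing import Dict, Optional

NOTFOUND = "WT_NOTFOUND"  # stand-in for the real return code


@dataclass
class ToyPage:
    """Hypothetical model of a leaf page that may have been fast-truncated."""
    rows: Dict[str, str]
    deleted: bool = False
    # Commit timestamp of the truncate; None models a "complicated" case
    # (for example, a truncate whose visibility cannot be decided cheaply).
    truncate_ts: Optional[int] = None
    instantiated: bool = False


@dataclass
class ToyStats:
    """Hypothetical counters, standing in for the real stats."""
    deleted_pages_instantiated: int = 0
    deleted_pages_skipped: int = 0


def ordinary_search(page: ToyPage, key: str, read_ts: int) -> str:
    """Toy version of the ordinary read path on an in-memory page."""
    if key not in page.rows:
        return NOTFOUND
    tombstone_visible = (
        page.deleted
        and page.truncate_ts is not None
        and page.truncate_ts <= read_ts
    )
    return NOTFOUND if tombstone_visible else page.rows[key]


def search(page: ToyPage, key: str, read_ts: int, stats: ToyStats,
           for_write: bool = False) -> str:
    """Search for key, avoiding instantiation when that is clearly safe."""
    if page.deleted and not page.instantiated:
        simple = page.truncate_ts is not None
        if not for_write and simple and page.truncate_ts <= read_ts:
            # Shortcut: the truncate is visible to this reader, so every
            # key on the page is gone. No need to read the page in.
            stats.deleted_pages_skipped += 1
            return NOTFOUND
        # Fallback: positioning for a write, or visibility is not
        # straightforward. Instantiate and use the ordinary path.
        page.instantiated = True
        stats.deleted_pages_instantiated += 1
    return ordinary_search(page, key, read_ts)
```

      In this model, a reader at a timestamp at or after the truncate gets WT_NOTFOUND and only the "skipped" counter moves. A reader at an older timestamp, or any search positioning for a write, still instantiates the page.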

      Motivation

      • Does this affect any team outside of WT?
        No.
      • How likely is it that this use case or problem will occur?
        It seemed to me that it happens a fair amount, but that was while I was working on truncate, so my impression might be skewed.
      • If the problem does occur, what are the consequences and how severe are they?
        Entirely performance (and I/O bandwidth).
      • Is this issue urgent?
        No.

      Acceptance Criteria (Definition of Done)
      Either it is possible for cursor->search for a key on a deleted page to return WT_NOTFOUND without reading the page in, possibly only under circumstances where this is not unduly complicated... or the conclusion is that trying to do this is not worthwhile.

      • Testing
        Nothing particularly special, though it would probably be good to create a Python test that specifically sets up the scenario and uses stats counters to confirm that it works as intended. The stats already exist.
      • Documentation update
        The fast-truncate architecture guide page is sufficiently detailed that updating it is probably desirable. I hesitate to create that ticket, though, until someone decides the change itself is worth pursuing.

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            dholland+wt@sauclovia.org David Holland