Resolve missed disagg reconciliation cases that result in btree size accounting being incorrect.

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • WT12.0.0
    • Affects Version/s: None
    • Component/s: Checkpoints
    • None

      While writing Python tests for the checkpoint size project, we found at least two cases where our accounting wasn't tracking page discards correctly. This comes back to the page ID logic in disagg: WiredTiger doesn't discard a page directly, and instead leaves it to the backend to determine whether the page is still relevant.

      It seems like the primary fix is:

      diff --git a/src/reconcile/rec_write.c b/src/reconcile/rec_write.c
      index 704f450800c..4265d18f74f 100644
      --- a/src/reconcile/rec_write.c
      +++ b/src/reconcile/rec_write.c
      @@ -2881,6 +2881,9 @@ __rec_write_wrapup(WT_SESSION_IMPL *session, WTI_RECONCILE *r)
                   disagg_page_free_required =
                     (r->multi_next != 1 || r->multi->block_meta->page_id == WT_BLOCK_INVALID_PAGE_ID);
               WT_RET(__wt_ref_block_free(session, ref, disagg_page_free_required));
      +        /* Update the tree size accounting. */
      +        if (disagg_page_is_valid && !disagg_page_free_required && r->multi->block_meta->delta_count == 0)
      +            __wt_btree_decrease_size(session, ref->page->disagg_info->block_meta.cumulative_size);
               break;
           case WT_PM_REC_EMPTY: /* Page deleted */
               break;
      

      However, it is unclear whether further fixes around the disagg_page_free_required variable are needed. This ticket tracks that investigation and is also where the fixes will be merged.
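
      To make it easier to reason about which reconciliation outcomes adjust the tree size, the condition from the diff above can be modelled in isolation. This is only a sketch under stated assumptions: the struct, the model_* function names and MODEL_INVALID_PAGE_ID are hypothetical stand-ins rather than WiredTiger definitions, and disagg_page_is_valid is treated as an opaque input because its computation is not shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for WT_BLOCK_INVALID_PAGE_ID; the real value is not shown in the ticket. */
#define MODEL_INVALID_PAGE_ID UINT64_MAX

/* Hypothetical, simplified view of the block metadata fields the diff reads. */
struct model_block_meta {
    uint64_t page_id;
    uint32_t delta_count;
    uint64_t cumulative_size;
};

/* Mirrors the expression assigned to disagg_page_free_required in the diff. */
static bool
model_free_required(size_t multi_next, const struct model_block_meta *new_meta)
{
    /* The new block's metadata is only read when exactly one block was written. */
    assert(multi_next != 1 || new_meta != NULL);
    return (multi_next != 1 || new_meta->page_id == MODEL_INVALID_PAGE_ID);
}

/*
 * Returns the number of bytes the added lines would subtract from the tree size,
 * or 0 when the added condition does not hold.
 */
static uint64_t
model_size_decrease(bool old_page_is_valid, size_t multi_next,
  const struct model_block_meta *old_meta, const struct model_block_meta *new_meta)
{
    assert(old_meta != NULL);

    if (!old_page_is_valid)
        return (0);
    /* When a free is required, the added lines do not touch the size. */
    if (model_free_required(multi_next, new_meta))
        return (0);
    /* The added lines only apply when the new block carries no deltas. */
    if (new_meta->delta_count != 0)
        return (0);
    return (old_meta->cumulative_size);
}
```

      In this model only one combination yields a non-zero decrease: a valid old page, exactly one replacement block with a valid page ID, and a delta count of zero on the new block. Every other combination returns 0, and those are the cases the remaining investigation around disagg_page_free_required would need to cover.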

            Assignee:
            Luke Pearson
            Reporter:
            Luke Pearson