Track b-tree size using the aggregate size on the address cookie.


    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Btree
    • Storage Engines, Storage Engines - Persistence
    • SE Persistence - 2026-01-30

      To report the full database size, we need to be able to sum the sizes of the individual b-trees. This change aggregates the sizes of all the address cookies in a b-tree and saves the total in that tree's checkpoint metadata.
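
      As a rough illustration of the end goal, once each tree's checkpoint carries a size, the database size is just the sum over all trees. The sketch below is not WiredTiger code: example_ckpt_size and example_database_size are hypothetical names, and how the per-tree sizes are read out of the metadata is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: the size saved in one tree's most recent checkpoint. */
struct example_ckpt_size {
    const char *uri; /* Name of the tree, for reporting only. */
    uint64_t size;   /* Bytes recorded in the checkpoint's size field. */
};

/* Sum the per-tree checkpoint sizes to get a database-wide total. */
static uint64_t
example_database_size(const struct example_ckpt_size *trees, size_t ntrees)
{
    uint64_t total = 0;

    for (size_t i = 0; i < ntrees; i++)
        total += trees[i].size;
    return total;
}
```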

       

      To implement this:

      1. Add a bytes_total field to the __wt_btree structure (the patch below refers to it as bytes_compressed_total).
      2. When a block is written via __wti_block_disagg_write_internal, add its size to the b-tree's total.
      3. When a block is freed or discarded via __wti_block_disagg_page_discard, subtract its size from the total.
      4. When a checkpoint is taken, update the ckpt->size field in __wt_meta_ckptlist_set (meta_ckpt.c).
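
      Steps 1 to 3 amount to keeping an atomic running total per tree. The following is a minimal standalone sketch of that bookkeeping using C11 atomics rather than WiredTiger's own atomic wrappers. The example_btree structure and the example_* functions are hypothetical stand-ins, not the real __wt_btree or block manager functions, and relaxed ordering is an assumption carried over from the relaxed load in the item 4 patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of __wt_btree (step 1). */
struct example_btree {
    _Atomic uint64_t bytes_total; /* Sum of the sizes of all live blocks in the tree. */
};

/* Step 2: a block of `size` bytes was written, so grow the total. */
static void
example_block_written(struct example_btree *btree, uint64_t size)
{
    atomic_fetch_add_explicit(&btree->bytes_total, size, memory_order_relaxed);
}

/* Step 3: a block of `size` bytes was discarded, so shrink the total. */
static void
example_block_discarded(struct example_btree *btree, uint64_t size)
{
    uint64_t prev = atomic_fetch_sub_explicit(&btree->bytes_total, size, memory_order_relaxed);

    /* The total must never drop below zero; catch accounting bugs in debug builds. */
    assert(prev >= size);
    (void)prev;
}

/* Read the current total, for example when a checkpoint is taken (step 4). */
static uint64_t
example_btree_size(struct example_btree *btree)
{
    return atomic_load_explicit(&btree->bytes_total, memory_order_relaxed);
}
```

      In this sketch, writing a 4096-byte block and a 1024-byte block and then discarding the 4096-byte one leaves a total of 1024. Whether a rewritten page should be accounted as a discard of the old block followed by a write of the new one is a design question this sketch does not settle.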

      Item 4 is effectively this patch:

              if (F_ISSET(ckpt, WT_CKPT_ADD)) {
                  ckpt->next_page_id = btree->next_page_id;
                  /* For disaggregated storage, save the current total compressed bytes to ckpt->size. */
                  if (F_ISSET(btree, WT_BTREE_DISAGGREGATED)) {
                      ckpt->size = __wt_atomic_load_uint64_relaxed(&btree->bytes_compressed_total);
                  }
              } 

      Write a unit test to validate that the b-tree size is being written out via checkpoint. 

      Hint: This is test_disagg_checkpoint_size01.py on the PoC.

      Scope:

      1. Track b-tree level database size.
      2. Write a test.

            Assignee:
            Mariam Mojid
            Reporter:
            Luke Pearson
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: