Incorrect comparison between local table and metadata checkpoint orders during pruning

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Layered Tables
    • Team/s: Storage Engines, Storage Engines - Foundations
    • Sprint: SE Foundations - 2025-09-12
    • Story Points: 8

      Currently, when we try to find the new `prune_timestamp` during the checkpoint pickup process, we use the global `conn->disaggregated_storage->ds_track` structure to find the oldest checkpoint that is currently in use. We then use the timestamp of this checkpoint as the new `prune_timestamp` value.

      This `ds_track` structure contains two fields, `order` and `timestamp`, which we record on every checkpoint pickup. The order is the number that follows "WiredTigerCheckpoint" in the name of the last checkpoint of the metadata table.

      While searching for the last checkpoint, we compare this metadata table checkpoint order with the current table's checkpoint order (obtained through `__layered_last_checkpoint_order()`). This comparison is incorrect: the checkpoint order is local to every table, so the two values are unrelated.
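
      The following self-contained snippet illustrates why the comparison is invalid; the values are made up for illustration:

```c
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    /* The metadata table has taken 12 checkpoints so far... */
    uint64_t metadata_order = 12;
    /* ...while a recently created layered table has taken only 3. */
    uint64_t table_order = 3;

    /*
     * The buggy check concludes the table's checkpoint predates the
     * metadata checkpoint, even though both counters advance
     * independently: 3 vs. 12 carries no cross-table meaning.
     */
    if (table_order < metadata_order)
        printf("looks older, but the counters are unrelated\n");
    return (0);
}
```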

      Please see WT-15158 for more details about how it was discovered.

      This requires some investigation of long-term solutions; the options we currently have in mind are:

      • Use a global, database-wide checkpoint ID for this purpose (the OpLog checkpoint LSN might be a good candidate).
      • Track the checkpoints in use locally for every table; this could increase memory consumption, but could help us determine the `prune_timestamp` faster and more accurately (see the sketch after this list).
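
      A rough sketch of the second option, assuming each table keeps a small fixed list of its in-use checkpoints; every name here (`table_track`, `table_oldest_inuse_ts`, `pick_prune_timestamp`) is hypothetical, not the WiredTiger API:

```c
#include <stddef.h>
#include <stdint.h>

struct table_ckpt_inuse {
    uint64_t order;     /* This table's own checkpoint counter. */
    uint64_t timestamp; /* Timestamp of that checkpoint. */
};

struct table_track {
    struct table_ckpt_inuse inuse[8]; /* Fixed cap bounds the extra memory. */
    size_t n_inuse;
};

/* Oldest in-use timestamp for one table; 0 means nothing is in use. */
static uint64_t
table_oldest_inuse_ts(const struct table_track *t)
{
    uint64_t ts = 0;

    for (size_t i = 0; i < t->n_inuse; i++)
        if (ts == 0 || t->inuse[i].timestamp < ts)
            ts = t->inuse[i].timestamp;
    return (ts);
}

/*
 * Derive the database-wide prune_timestamp as the minimum non-zero value
 * across tables: timestamps, unlike per-table checkpoint orders, are
 * globally comparable.
 */
static uint64_t
pick_prune_timestamp(const struct table_track *tables, size_t n_tables)
{
    uint64_t prune_ts = 0, ts;

    for (size_t i = 0; i < n_tables; i++) {
        ts = table_oldest_inuse_ts(&tables[i]);
        if (ts != 0 && (prune_ts == 0 || ts < prune_ts))
            prune_ts = ts;
    }
    return (prune_ts);
}
```

      The aggregation step works only because timestamps carry global meaning; the per-table orders stay purely local bookkeeping.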

              Assignee:
              [DO NOT USE] Backlog - Storage Engines Team
              Reporter:
              Ivan Kochin