Incorrect comparison between local table and metadata checkpoint orders during pruning


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical - P2
    • Affects Version/s: None
    • Component/s: Layered Tables
    • Team/s: Storage Engines, Storage Engines - Foundations
    • Sprint: SE Foundations - 2025-10-10
    • Story Points: 8

      Currently, when we try to find the new prune_timestamp during the checkpoint pickup process, we use the global conn->disaggregated_storage->ds_track structure to find the oldest checkpoint that is currently in use. The timestamp from that checkpoint is then used as the new prune_timestamp value.
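      For illustration, here is a minimal sketch of that selection logic. All names below (DS_TRACK_ENTRY, ds_track_oldest_in_use(), the in_use flag) are assumptions for illustration, not the actual ds_track layout:

      /*
       * Hedged sketch: choose the new prune_timestamp as the timestamp
       * of the oldest checkpoint still in use. Hypothetical names.
       */
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      typedef struct {
          uint64_t order;     /* checkpoint order recorded at pickup */
          uint64_t timestamp; /* checkpoint timestamp */
          bool in_use;        /* is this checkpoint still referenced? */
      } DS_TRACK_ENTRY;

      static uint64_t
      ds_track_oldest_in_use(const DS_TRACK_ENTRY *track, size_t count)
      {
          uint64_t oldest_ts = UINT64_MAX;

          for (size_t i = 0; i < count; i++)
              if (track[i].in_use && track[i].timestamp < oldest_ts)
                  oldest_ts = track[i].timestamp;
          return (oldest_ts); /* becomes the new prune_timestamp */
      }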

      This ds_track structure contains two fields, order and timestamp, which we record every time we do a checkpoint pickup. The order is the number that follows "WiredTigerCheckpoint" in the name of the last checkpoint of the metadata table.
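      As a hedged sketch of how such an order could be extracted from a checkpoint name, assuming a "WiredTigerCheckpoint.<n>" naming convention (both the suffix form and the helper below are illustrative assumptions):

      #include <stdint.h>
      #include <stdlib.h>
      #include <string.h>

      #define CKPT_PREFIX "WiredTigerCheckpoint"

      /* Parse the numeric order that follows the checkpoint prefix. */
      static int
      checkpoint_name_to_order(const char *name, uint64_t *orderp)
      {
          const char *suffix;

          if (strncmp(name, CKPT_PREFIX, strlen(CKPT_PREFIX)) != 0)
              return (-1); /* not a checkpoint name we recognize */
          suffix = name + strlen(CKPT_PREFIX);
          if (*suffix == '.')
              ++suffix;
          *orderp = strtoull(suffix, NULL, 10);
          return (0);
      }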

      During the search for the last checkpoint, we compare this metadata table checkpoint order with the current table's checkpoint order (obtained through __layered_last_checkpoint_order()). This is incorrect: checkpoint orders are local to every table, so orders from different tables are unrelated. For example, a recently created table might be at checkpoint order 3 while the metadata table is at order 57; the counters advance independently, so comparing them says nothing about which checkpoint is older.

      Please see WT-15158 for more details about how it was discovered.

      This requires investigating long-term solutions; the options we currently have in mind are:

      • Use a global, database-wide checkpoint ID for this purpose (the OpLog checkpoint LSN might be a good candidate).
      • Track checkpoints in use locally for every table; this could increase our memory consumption, but could help us determine the pruning_timestamp faster and more accurately (a rough sketch of this option follows this list).
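      As a rough sketch of the second option, under assumed names (TABLE_CKPT_TRACK, TABLE_TRACK, and prune_timestamp_candidate() are all hypothetical; the point is that timestamps, unlike orders, are comparable across tables):

      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical per-table record of checkpoints currently in use. */
      typedef struct {
          uint64_t order;     /* order local to this table only */
          uint64_t timestamp; /* globally comparable timestamp */
          bool in_use;
      } TABLE_CKPT_TRACK;

      typedef struct {
          TABLE_CKPT_TRACK *ckpts;
          size_t ckpt_count;
      } TABLE_TRACK;

      /*
       * Compute a candidate prune_timestamp as the minimum timestamp of
       * any checkpoint still in use across all tables, never comparing
       * per-table orders against each other.
       */
      static uint64_t
      prune_timestamp_candidate(const TABLE_TRACK *tables, size_t table_count)
      {
          uint64_t min_ts = UINT64_MAX;

          for (size_t t = 0; t < table_count; t++)
              for (size_t c = 0; c < tables[t].ckpt_count; c++)
                  if (tables[t].ckpts[c].in_use &&
                      tables[t].ckpts[c].timestamp < min_ts)
                      min_ts = tables[t].ckpts[c].timestamp;
          return (min_ts);
      }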

        Attachments:
        1. out (2.41 MB)
        2. test_copy.js (3 kB)

            Assignee: Ivan Kochin
            Reporter: Ivan Kochin