Add table corruption detection test cases for DisAgg table verification


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Verify
    • None
    • Storage Engines, Storage Engines - Foundations, Storage Engines - Persistence
    • SE Foundations - Q3+ Backlog
    • 8

      We already have unit tests to check whether our verification can detect corruptions, but they currently work only for regular tables, as they rely on writing invalid data chains to local files.

      This approach doesn’t work for DisAgg shared tables since they are stored in PALS, meaning no data is stored locally.

      I currently have two ideas for how to implement this test case, both based on calling the PALM interface during testing, locating the page that should be corrupted, and overwriting it with invalid content (e.g., filling it with zeroes).

      The first approach is to use the PALM Python wrapper. A preliminary algorithm might look like this:

      1. Read the metadata to extract the table_id.
      2. Use this table_id to open the PL dhandle (via pl_open_handle()).
      3. Open the checkpoint (possibly via pl_get_open_checkpoint()?).
      4. The unclear part: use plh->put() and plh->get() to somehow overwrite the targeted page with corrupted content.
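
      To make the intended flow concrete, here is a minimal Python sketch of the get/overwrite/verify cycle against an in-memory mock. Everything in it is hypothetical: MockPageHandle, corrupt_page() and verify() are illustrative names, not the PALM wrapper API, and the assumption that verification catches the damage through a per-page checksum is mine. It only shows the shape a test could take once the real put()/get() usage is worked out.

```python
import zlib


class MockPageHandle:
    """Hypothetical in-memory stand-in for a page log handle (not the PALM API)."""

    def __init__(self, table_id):
        self.table_id = table_id
        self._pages = {}  # page_id -> (payload, checksum recorded at write time)

    def put(self, page_id, payload, checksum=None):
        # A normal write records the checksum of the payload. Passing an
        # explicit checksum models a raw overwrite that keeps the old one.
        if checksum is None:
            checksum = zlib.crc32(payload)
        self._pages[page_id] = (bytes(payload), checksum)

    def get(self, page_id):
        return self._pages[page_id]

    def page_ids(self):
        return sorted(self._pages)


def corrupt_page(handle, page_id):
    """Overwrite the page with zeroes while keeping its stored checksum."""
    payload, checksum = handle.get(page_id)
    handle.put(page_id, bytes(len(payload)), checksum=checksum)


def verify(handle):
    """Return the ids of pages whose payload no longer matches its checksum."""
    bad = []
    for page_id in handle.page_ids():
        payload, checksum = handle.get(page_id)
        if zlib.crc32(payload) != checksum:
            bad.append(page_id)
    return bad


if __name__ == "__main__":
    handle = MockPageHandle(table_id=42)  # table_id would come from the metadata
    handle.put(1, b"root")
    handle.put(2, b"leaf")
    print(verify(handle))  # no corruption yet
    corrupt_page(handle, 2)
    print(verify(handle))  # page 2 is now reported
```

      In a real test the mock would be replaced by the handle returned from the wrapper, and verify() by the actual table verification call.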

      The second approach is similar but involves writing a custom public C function in PALM that accepts a table_id and a page identifier, and then overwrites the specified page with the given content. This feels a bit more intrusive, as it requires exposing API functionality purely for testing purposes.
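
      As a rough illustration of the interface this second approach implies, the sketch below models a single overwrite entry point that takes a table_id, a page identifier and the replacement content. It is written in Python against a hypothetical PageStore class purely to show the shape of the call; the real function would be C inside PALM. The test_hooks_enabled guard is my own assumption about one way to limit the exposure of a test-only function, not something the ticket specifies.

```python
class PageStore:
    """Hypothetical page store keyed by (table_id, page_id); not the PALM API."""

    def __init__(self, test_hooks_enabled=False):
        self.test_hooks_enabled = test_hooks_enabled
        self._pages = {}

    def write(self, table_id, page_id, content):
        self._pages[(table_id, page_id)] = bytes(content)

    def read(self, table_id, page_id):
        return self._pages[(table_id, page_id)]

    def overwrite_page_for_test(self, table_id, page_id, content):
        """Replace an existing page with arbitrary content (test builds only)."""
        if not self.test_hooks_enabled:
            raise PermissionError("test hooks are disabled")
        key = (table_id, page_id)
        if key not in self._pages:
            raise KeyError(f"no page {page_id} in table {table_id}")
        self._pages[key] = bytes(content)


if __name__ == "__main__":
    store = PageStore(test_hooks_enabled=True)
    store.write(7, 1, b"data")
    store.overwrite_page_for_test(7, 1, bytes(4))  # fill the page with zeroes
    print(store.read(7, 1))
```

      Rejecting the call when the hook is disabled keeps the entry point inert outside test runs, which may soften the concern about exposing it.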

              Assignee:
              [DO NOT USE] Backlog - Storage Engines Team
              Reporter:
              Ivan Kochin
              Votes:
              0
              Watchers:
              1

                Created:
                Updated: