- Type: Sub-task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Test Python
- Security Level: Public (Available to anyone on the web)
- None
- Storage Engines, Storage Engines - Persistence
- 0.003
- SE Persistence backlog
- None
Context
The Python suite has no helper for producing a corrupt disaggregated database. A test that wants to exercise a corrupt turtle, a corrupt metadata page, a corrupt leaf, a missing page, or a truncated delta chain has to drop into raw SQLite against palite's per-shard pages_<shard>.db files inline.
Palite holds an exclusive SQLite lock while the WT connection is open, so writes from a Python test must close the WT connection, open the shard DB read-write, run a single statement, and reopen WT. Writes also need to go through the sqlite3 binary built next to wt (wt_builddir) rather than the system binary, to avoid version skew with the SQLite statically linked into palite.
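The close / write / reopen sequence described above could be wrapped in a context manager so the reopen always happens, even when the SQLite write fails. This is a minimal sketch: close_conn and reopen_conn are the method names used in the corrupt_page_image sketch in this ticket, while wt_connection_closed and FakeTest are hypothetical names introduced only for illustration.

```python
from contextlib import contextmanager


@contextmanager
def wt_connection_closed(test):
    """Keep the test's WT connection closed for the duration of the block."""
    # Closing the connection is what lets palite release its SQLite lock.
    test.close_conn()
    try:
        yield
    finally:
        # Reopen even if the write raised, so teardown sees a live connection.
        test.reopen_conn()


class FakeTest:
    """Hypothetical stand-in for a test case; it only records the call order."""

    def __init__(self):
        self.calls = []

    def close_conn(self):
        self.calls.append('close')

    def reopen_conn(self):
        self.calls.append('reopen')
```

A helper would then run its single sqlite3 statement inside `with wt_connection_closed(self):`, which keeps the window during which WT is closed as short as possible.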
Palite's schema is fixed at ext/page_log/palite/palite.cpp:1510:
- Primary key (table_id, page_id, lsn).
- Payload column page_data BLOB.
- Flags column with bits WT_PAGE_LOG_DELTA = 0x2 and WT_PAGE_LOG_DISCARDED = 0x10000 (static-asserted at palite.cpp:1506-1507).
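To make the key and flag layout concrete, the sketch below builds an in-memory stand-in for one shard and selects the delta rows of a page, which is the kind of lookup a truncated-delta-chain helper would need. The table name pages comes from the sketch in this ticket and the two flag values are the ones quoted above; the column name flags, the column types, and the sample rows are assumptions made for illustration, not palite's actual DDL.

```python
import sqlite3

# Flag bits as quoted in this ticket.
WT_PAGE_LOG_DELTA = 0x2
WT_PAGE_LOG_DISCARDED = 0x10000


def describe_flags(flags):
    """Return the names of the known bits set in a row's flags value."""
    names = []
    if flags & WT_PAGE_LOG_DELTA:
        names.append('delta')
    if flags & WT_PAGE_LOG_DISCARDED:
        names.append('discarded')
    return names


def make_demo_db():
    """Build an in-memory stand-in for a shard (assumed column names and types)."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        "CREATE TABLE pages ("
        " table_id INTEGER, page_id INTEGER, lsn INTEGER,"
        " flags INTEGER, page_data BLOB,"
        " PRIMARY KEY (table_id, page_id, lsn))")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?, ?, ?)",
        [(7, 100, 10, 0, b'full'),
         (7, 100, 11, WT_PAGE_LOG_DELTA, b'd1'),
         (7, 100, 12, WT_PAGE_LOG_DELTA, b'd2'),
         (7, 101, 10, WT_PAGE_LOG_DISCARDED, b'')])
    return conn


def delta_lsns(conn, table_id, page_id):
    """List the LSNs of the delta rows for one page, oldest first."""
    rows = conn.execute(
        "SELECT lsn FROM pages "
        "WHERE table_id=? AND page_id=? AND (flags & ?) != 0 "
        "ORDER BY lsn",
        (table_id, page_id, WT_PAGE_LOG_DELTA))
    return [lsn for (lsn,) in rows]
```

Because lsn is part of the primary key, a page can have several rows; a helper has to pick which one to damage rather than assume one row per page.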
Existing helpers to reuse:
- get_shard_id at test/suite/helpers/helper_disagg.py:71 — maps table_id to shard.
- get_table_id at test/suite/helpers/metadata_helper.py:40 — maps URI to table_id.
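Once a URI has been resolved to a table_id and then to a shard, the remaining step is locating the shard file. The sketch below assumes the kv_home directory and the two-digit pages_<shard>.db naming used in the corrupt_page_image sketch in this ticket; shard_db_path is a hypothetical name, and the shard_id argument is expected to be whatever get_shard_id returns.

```python
import os


def shard_db_path(home, shard_id):
    """Return the path of the palite shard database for shard_id under home."""
    # Layout assumed from this ticket: <home>/kv_home/pages_<two-digit shard>.db
    return os.path.join(home, 'kv_home', f'pages_{shard_id:02d}.db')
```

Keeping the path logic in one place means a change to the on-disk layout only has to be absorbed once.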
Motivation
Without a shared helper, every author writing a corrupt-state Python test reinvents the same close-conn / sqlite-UPDATE / reopen-conn dance. The result is duplicated logic and tests that drift in how they target rows. A single mixin keeps every corruption helper consistent with palite's schema and lock semantics.
Out of scope
- Tests against any consuming wt subcommand. Those live with each subcommand's own ticket and call into this mixin.
- A non-palite implementation of the helpers.
Examples
Sketch of corrupt_page_image:
def corrupt_page_image(self, table_id, page_id, lsn=None):
    shard = get_shard_id(table_id)
    db = os.path.join(self.home, 'kv_home', f'pages_{shard:02d}.db')
    # The sqlite3 shell runs extra command-line arguments as SQL rather than
    # binding them to '?', so the integer values are inlined into the statement.
    tid, pid = int(table_id), int(page_id)
    target = 'NULL' if lsn is None else str(int(lsn))
    # Overwrite the first byte with 0xff. x'ff' is a one-byte blob, whereas
    # char(0xff) is a two-byte UTF-8 string; the CAST keeps the column a BLOB.
    sql = ("UPDATE pages "
           "SET page_data = CAST(x'ff' || substr(page_data, 2) AS BLOB) "
           f"WHERE table_id={tid} AND page_id={pid} "
           f"AND lsn = COALESCE({target}, "
           f"(SELECT MAX(lsn) FROM pages "
           f"WHERE table_id={tid} AND page_id={pid}))")
    # Close the WT connection so palite releases its SQLite lock.
    self.close_conn()
    try:
        subprocess.run([os.path.join(wt_builddir, 'sqlite3'), db, sql],
                       check=True)
    finally:
        self.reopen_conn()
    return (page_id, lsn)