- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Test Python
- Storage Engines - Foundations
- 67.462
Fast truncate marks whole leaf pages deleted without materializing them, so it has no per-page cache footprint and commits cleanly even under a tight cache. Slow truncate falls back to per-record cursor->remove(), which pulls every page in the truncation range into cache. Under cache pressure — particularly when a long-running reader pins original page images and prevents eviction — slow truncate fills cache faster than eviction can drain it, and the transaction is rolled back with WT_ROLLBACK.
On disaggregated layered tables this affects both sides:
- Leader (debug_mode=(slow_truncate=true)): hits the btree path
- Follower (debug_mode=(disagg_slow_truncate_follower=true)): hits the layered cursor ingest path
In production a long-running secondary read on the follower is sufficient to pin pages and turn a slow-path truncate into a rollback storm. Fast truncate has no such failure mode under identical configuration.
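The difference between the two paths can be illustrated with a toy model. This is a sketch under simplifying assumptions, not WiredTiger code: the cache is counted in whole pages, eviction is all-or-nothing, and the names `truncate`, `Rollback`, `cache_pages` and `reader_pins_pages` are hypothetical.

```python
class Rollback(Exception):
    """Stands in for WT_ROLLBACK in this toy model."""


def truncate(num_pages, cache_pages, fast, reader_pins_pages):
    """Return how many pages are resident in cache when the truncate commits.

    Toy model only: page counts replace bytes, and eviction either drains
    the whole cache or is blocked entirely by a pinning reader.
    """
    if fast:
        # Fast path: leaf pages are marked deleted without being read in.
        return 0

    resident = 0
    for _ in range(num_pages):
        # Slow path: each per-record remove brings its page into cache.
        if resident == cache_pages:
            if reader_pins_pages:
                # Nothing can be evicted, so the transaction is rolled back.
                raise Rollback("cache full, pages pinned by reader")
            resident = 0  # eviction drains the cache
        resident += 1
    return resident
```

In this model, `truncate(100, 10, fast=True, reader_pins_pages=True)` never touches the cache, while the same call with `fast=False` raises `Rollback` as soon as the range outgrows the cache.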
Reproduction
test/suite/test_layered_fast_truncate21.py covers the cross-product {leader, follower} × {slow, fast}:
| Scenario | Result |
|---|---|
| leader + slow truncate | WT_ROLLBACK |
| leader + fast truncate | commits |
| follower + slow truncate | WT_ROLLBACK |
| follower + fast truncate | commits |
Config: layered table, 10 MB cache, eviction dirty and updates triggers both set to 100, ~180 MB of user data (350 K rows × 512 B), and a snapshot reader pinning original page images during the truncate. Runs in ~15 s.
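As a quick sanity check on the sizing, the arithmetic below shows how far the data set exceeds the cache. It assumes decimal megabytes and counts value bytes only (keys and per-record overhead are ignored); the variable names are illustrative.

```python
ROWS = 350_000
VALUE_BYTES = 512
CACHE_BYTES = 10 * 1000 * 1000  # assumption: 10 MB taken as decimal megabytes

data_bytes = ROWS * VALUE_BYTES
ratio = data_bytes / CACHE_BYTES

print(f"user data: {data_bytes / 1e6:.1f} MB")   # 179.2 MB
print(f"data-to-cache ratio: {ratio:.1f}x")      # 17.9x
```

With roughly eighteen times more data than cache, a slow truncate over the full range cannot keep every touched page resident, so it depends on eviction making progress.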
See also: WT-17650 (introduced slow_truncate / disagg_slow_truncate_follower debug knobs)