Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Storage Engines, Storage Engines - Persistence
In __wti_block_disagg_write, bytes_total is incremented immediately after the page is successfully written to SLS. If address packing subsequently fails, the error returns with no cleanup - bytes_total has been incremented but is never rolled back.
The error propagates up through reconciliation (__rec_split_write), where the cleanup path (__rec_write_err) checks for a block cookie before freeing the written block. The cookie is set by __wt_memdup, which only executes if addr_pack succeeds, so a failure in address packing guarantees the cookie is never set, the check is false, and no decrement occurs. Because the reconciliation cleanup path is gated on the cookie existing, it can never roll back an addr_pack failure - the decrement must happen in the block manager, at the point of failure, before the error is returned.
The result is permanent bytes_total inflation: the orphaned page is eventually GC'd by the page server, but WT has no mechanism to follow suit, and every subsequent checkpoint size is overstated by the size of the orphaned write.
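A minimal sketch of the proposed rollback follows, as a standalone model rather than the actual WiredTiger code. All names here (toy_block, toy_page_write, toy_addr_pack, toy_block_write, fail_addr_pack) are hypothetical stand-ins. The sketch only assumes the ordering described above: write the page, increment bytes_total, then pack the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for block manager state; not the WiredTiger structure. */
struct toy_block {
    uint64_t bytes_total; /* Running total of bytes written. */
    bool fail_addr_pack;  /* Test hook: force the packing step to fail. */
};

/* Hypothetical page write; always succeeds in this sketch. */
static int
toy_page_write(struct toy_block *b, uint64_t size)
{
    (void)b;
    (void)size;
    return (0);
}

/* Hypothetical address packing step that can be made to fail. */
static int
toy_addr_pack(struct toy_block *b)
{
    return (b->fail_addr_pack ? -1 : 0);
}

/* Models the write path: write, account, pack, and roll back on a pack failure. */
int
toy_block_write(struct toy_block *b, uint64_t size)
{
    int ret;

    if ((ret = toy_page_write(b, size)) != 0)
        return (ret);
    b->bytes_total += size;

    if ((ret = toy_addr_pack(b)) != 0) {
        /*
         * No cookie exists yet, so no caller can undo the accounting.
         * Roll it back here, before the error is returned.
         */
        assert(b->bytes_total >= size);
        b->bytes_total -= size;
        return (ret);
    }
    return (0);
}
```

In this model, a write whose packing step fails leaves bytes_total unchanged, so the caller's cookie-gated cleanup has nothing left to undo.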
- related to
WT-16660 bytes_total increment not protected by reconciliation panic boundary
- Open