-
Type:
Improvement
-
Resolution: Fixed
-
Priority:
Major - P3
-
Affects Version/s: None
-
Component/s: None
-
RSS Sydney
-
Fully Compatible
-
PastaLaVista - 2025-03-18, pro-duck-tive - 2025-04-01, meow meow meow - 2025-04-15
The current strategy tries to kill the oldest session once a second under cache pressure.
The following strategy was suggested by WT eng on WT-14075:
The space taken by transactions consists of two parts:
1. Exclusively used by the transaction (e.g., a specific update).
2. Shared with other transactions (e.g., the page where the update resides).
Part (1) is tracked relatively precisely and instantly - its size can be queried via per-session stats, allowing us to estimate how much data will be freed up after a rollback.
Part (2) is more complex - it’s not directly tracked, and its release happens asynchronously through eviction.
While I don’t have a complete, robust solution, I’d suggest considering a dual-logic strategy:
1. Start rolling back transactions, estimating how much space is freed via updates.
2. Break the loop if either of these thresholds is met:
   - The amount of freed-up update space surpasses a set limit.
   - A fraction of transactions have been rolled back per iteration (e.g., 5% of total uncommitted transactions or a minimum of 10 transactions).
Introduce a reasonable delay between rollback attempts to allow eviction to catch up.
This strategy should handle two key edge cases (and their combinations): (1) Most of the occupied space comes from exclusive transaction parts (updates). (2) Updates take up a small fraction of space, with most being held in pages.
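A minimal sketch of the dual-logic loop described above, to make the two break conditions and the delay concrete. Everything named here is hypothetical and not taken from the server or WiredTiger code: `Txn`, `rollback_pass`, `relieve_cache_pressure`, and the `rollback`, `get_txns`, and `under_pressure` callbacks are illustrative stand-ins. It also assumes that transactions are rolled back oldest first (as the current strategy does), and that "5% or a minimum of 10" means the per-iteration cap is the larger of the two.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Txn:
    """Hypothetical stand-in for an uncommitted transaction."""
    session_id: int
    age_secs: float    # how long the transaction has been open
    update_bytes: int  # exclusive (update) bytes, e.g. read from per-session stats


def rollback_pass(txns: List[Txn],
                  rollback: Callable[[Txn], None],
                  freed_limit_bytes: int,
                  max_fraction: float = 0.05,
                  min_count: int = 10) -> Tuple[int, int]:
    """Run one rollback iteration; return (transactions rolled back, estimated bytes freed)."""
    # Assumed reading of the cap: a fraction of all uncommitted transactions,
    # but never fewer than min_count.
    cap = max(min_count, int(len(txns) * max_fraction))
    freed = 0
    count = 0
    # Assumption: oldest transactions are rolled back first.
    for txn in sorted(txns, key=lambda t: t.age_secs, reverse=True):
        # Break when either threshold is met.
        if freed >= freed_limit_bytes or count >= cap:
            break
        rollback(txn)
        freed += txn.update_bytes  # only the exclusive part can be estimated
        count += 1
    return count, freed


def relieve_cache_pressure(get_txns: Callable[[], List[Txn]],
                           rollback: Callable[[Txn], None],
                           under_pressure: Callable[[], bool],
                           freed_limit_bytes: int,
                           delay_secs: float = 1.0,
                           sleep: Callable[[float], None] = time.sleep) -> None:
    """Repeat rollback passes while under pressure, pausing so eviction can catch up."""
    while under_pressure():
        txns = get_txns()
        if not txns:
            break
        rollback_pass(txns, rollback, freed_limit_bytes)
        # The shared (page) part is released asynchronously by eviction,
        # so wait before deciding whether another pass is needed.
        sleep(delay_secs)
```

The update-space limit covers the case where most space is held by updates, since the pass stops as soon as enough exclusive bytes are estimated freed. The per-iteration cap plus the delay covers the case where most space is held in pages, where the freed-bytes estimate would otherwise never reach the limit.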
- related to
-
WT-14075 Investigate possible new metrics that indicate the necessity to stop long running transactions.
-
- Closed
-
- split from
-
SERVER-101817 Update underCachePressure to query updated WT stats
-
- In Progress
-
- split to
-
SERVER-102762 Determine suitable defaults for cache-pressure-eviction parameters
-
- Backlog
-