-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: DHandles
-
Storage Engines - Foundations
It's unclear whether this will eventually be addressed by a Server ticket, wholly within WT, or by a combination of the two. We'll discuss here before asking for any (minimal) Server work.
The issue, noticed during WT-16221, is file_manager.close_handle_minimum, which defaults to 250. If the number of open btrees is at or below this number, the sweep does not mark any btrees with timeofdeath and does not expire any btrees (that is, cause them to be closed). To be clear, the sweep stops walking the dhandle list as soon as it notices this condition.
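To make the behavior described above concrete, here is a minimal toy model in Python. It is not WiredTiger code: the DHandle class, the sweep function, and the 30-second idle threshold are assumptions made for illustration, and the real sweep's separate mark (timeofdeath) and expire steps are collapsed into a single idle check.

```python
from dataclasses import dataclass

CLOSE_HANDLE_MINIMUM = 250  # default quoted in this ticket
CLOSE_IDLE_TIME = 30        # assumed idle threshold in seconds, for illustration only


@dataclass
class DHandle:
    """Hypothetical stand-in for a dhandle; not the WiredTiger structure."""
    name: str
    last_used: float
    is_open: bool = True


def sweep(handles, now, minimum=CLOSE_HANDLE_MINIMUM, idle_time=CLOSE_IDLE_TIME):
    """Close idle handles, stopping the walk once the open count is at or below minimum."""
    open_count = sum(h.is_open for h in handles)
    closed = []
    for h in handles:
        if open_count <= minimum:
            break  # the walk stops here; remaining handles are never examined
        if h.is_open and now - h.last_used >= idle_time:
            h.is_open = False
            open_count -= 1
            closed.append(h.name)
    return closed
```

In this model, 100 handles that have been idle for an hour all stay open, because the walk ends before it looks at any of them.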
This can be a generally sensible policy for keeping handles around "just in case", although I don't think much thought has gone into how it is used within MongoDB. One can imagine a huge btree (or set of btrees) that was opened an hour ago, has not been used since, and remains open. Yes, eviction will generally push the individual pages out, but if the btrees are truly idle, maybe they should eventually be closed entirely, saving eviction from the much slower process of discovering pages. For example, when running beneath close_handle_minimum, we could use an expiry time different from close_idle_time, say 5 times that number, or even one set by a new config key.
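The "different expiry time beneath the minimum" idea could be sketched as follows. This is only an illustration of the proposal: the function names and the below_minimum_factor parameter are hypothetical, the factor of 5 comes from the example above, and the 30-second idle time is an assumed value.

```python
def effective_idle_time(open_count, minimum=250, idle_time=30, below_minimum_factor=5):
    """Return the idle threshold (seconds) to apply for the current open handle count.

    below_minimum_factor is a hypothetical knob, not an existing config key.
    """
    if open_count <= minimum:
        # Beneath the minimum, handles are still eligible to close, just later.
        return idle_time * below_minimum_factor
    return idle_time


def should_expire(last_used, now, open_count, **kwargs):
    """True if a handle idle since last_used should be closed at time now."""
    return now - last_used >= effective_idle_time(open_count, **kwargs)
```

With these assumed numbers, a handle in a system with 100 open btrees would become eligible for closing after 150 seconds of idleness rather than never.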
For disagg in particular, the system does not work well because of outdated btrees. On a follower, we know when we pick up a checkpoint that we'll stop using stable btrees at an old checkpoint, and want an entirely new btree based on the newer checkpoint. We tag the old dhandles as outdated, and we really want them swept and closed as soon as all cursors are closed on them. But with close_handle_minimum at 250, that won't happen. If we have 10 tables, we might have the last 25 "old" versions of btrees lying around open. They'll never be used again. This is a rare case where having a small number of dhandles is a distinct disadvantage.
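The accumulation described above can be reproduced with a small simulation. It is a toy model under stated assumptions: every outdated handle has no open cursors and has been idle long enough to expire, the sweep closes the oldest handles first, and it stops once the open count is no longer above the minimum. The function name and parameters are hypothetical.

```python
def simulate_follower(tables=10, checkpoints=100, minimum=250):
    """Return (open_count, outdated_count) after repeatedly picking up checkpoints."""
    open_handles = []  # (checkpoint_id, table_id) pairs, oldest first
    for ckpt in range(checkpoints):
        # Picking up a checkpoint opens a fresh btree per table;
        # every handle from an earlier checkpoint is now outdated.
        open_handles.extend((ckpt, t) for t in range(tables))
        # The sweep only closes handles while the open count exceeds the minimum.
        while len(open_handles) > minimum:
            open_handles.pop(0)
    latest = checkpoints - 1
    outdated = sum(1 for ckpt, _ in open_handles if ckpt != latest)
    return len(open_handles), outdated
```

With the defaults, the model settles at 250 open handles, of which only 10 belong to the latest checkpoint. Dropping the minimum to 10 leaves no outdated handles open.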
A simple "solution" is to lower the close_handle_minimum default substantially (10? or 1?), to lower it for disagg only (a little weird to have multiple defaults), or to have mongod set a low number in the config string for disagg. Another way is to explicitly track the number of outdated dhandles in the system, so that we know the number of useful btrees that are open. Or, even without tracking, we can always walk the entire list of btrees, closing outdated ones immediately and closing others only to satisfy close_handle_minimum. These are all easy solutions.
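The last option (always walk the whole list, close outdated handles immediately, and close others only to satisfy the minimum) might look roughly like the sketch below. It is a toy model rather than a proposed patch: the DHandle class, its outdated and in_use flags, and the sweep_full_walk function are hypothetical names, and the 30-second idle time is an assumed value.

```python
from dataclasses import dataclass


@dataclass
class DHandle:
    """Hypothetical stand-in for a dhandle; not the WiredTiger structure."""
    name: str
    last_used: float
    outdated: bool = False
    in_use: bool = False  # assumed flag: some cursor still references the handle
    is_open: bool = True


def sweep_full_walk(handles, now, minimum=250, idle_time=30):
    """Close outdated handles unconditionally, then idle ones only above the minimum."""
    closed = []
    # Pass 1: outdated handles are closed regardless of the minimum,
    # as soon as nothing is using them.
    for h in handles:
        if h.is_open and h.outdated and not h.in_use:
            h.is_open = False
            closed.append(h.name)
    # Pass 2: ordinary idle handles are closed only while above the minimum.
    open_count = sum(h.is_open for h in handles)
    for h in handles:
        if open_count <= minimum:
            break
        if h.is_open and not h.in_use and now - h.last_used >= idle_time:
            h.is_open = False
            open_count -= 1
            closed.append(h.name)
    return closed
```

The configuration-only alternative needs no code change at all: it would amount to passing something like file_manager=(close_handle_minimum=10) in the wiredtiger_open config string, with the specific value still to be decided.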
We'll need to address this for disagg, but we may also want to think about what we're doing in ASC, and how best to manage this parameter.
- is related to: WT-16221 Dhandle sweep should quickly expire outdated btrees
-
- In Code Review
-