-
Type:
New Feature
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Storage Engines - Transactions
-
359.026
-
None
-
None
Background
The MongoDB query optimizer needs to estimate the number of leaf pages in a WiredTiger table for cost modeling and join-order optimization.
Today we approximate this as:
(WT_STAT_DSRC_BLOCK_SIZE - WT_STAT_DSRC_BLOCK_REUSE_BYTES) / leaf_page_max
i.e., bytes-on-disk / 32KB. In practice, actual leaf page sizes deviate significantly from leaf_page_max, giving ~40% error on many TPC-H collections compared to WT_STAT_DSRC_BTREE_ROW_LEAF.
WT_STAT_DSRC_BTREE_ROW_LEAF is accurate but requires a tree walk (all statistics), which is too expensive for production query planning.
Prototype
We [prototyped|https://github.com/10gen/mongo/pull/51825] a sampling approach: open a random cursor, cast to WT_CURSOR_BTREE, and read ref->page->entries to estimate avg docs-per-leaf-page. This reduced error to <5% on TPC-H, but it depends on private WT headers and internal structs, so it's not production-safe.
Request
Expose a stable stat or API (per table) that provides either:
- Approximate leaf page count, or
- Approximate avg records-per-leaf-page
without requiring a full tree walk.
Constraints:
- Overhead must be small enough for regular use in production query planning paths.
- Approximate / eventually consistent is fine, as long as it avoids large systematic bias (e.g., <10% avg error vs the current ~40%).
- Must not require MongoDB to depend on private headers or internal WT structures.
Possible approaches:
- A new DSRC stat tracking approximate leaf page count via incremental updates on splits/merges.
- A DSRC stat for avg records-per-leaf-page, updated via background sampling or during normal cursor traversal.
- A lightweight statistics level that enables this metric without a full tree walk.
- blocks
-
SERVER-124504 Join optimization testing: Re-enable compression by default in plan stability test
-
- Blocked
-
- related to
-
SERVER-124073 [Join Optimization] Use page sampling to estimate average number of docs/page and better estimate for numPages in collection
-
- Blocked
-