[SERVER-57062] Improve memory footprint of document/value caching in window functions Created: 19/May/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Nicholas Zolnierz Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-57130 Traverse arrays while filling the Doc... Closed
Assigned Teams:
Query Optimization
Participants:

 Description   

As part of the commit that accounts for freed documents when they fall out of a window, we made an unfortunate compromise: to handle the inflation of the internal Document cache, we force all fields to be cached up front. This is inefficient and may force a spill to disk more often than necessary. Several options were discussed to avoid doing this for every document:

  • Use the required fields from the dependency analysis to populate the Document cache only for the fields we need. Note that if a required field lands on an object, some expressions (e.g. $max) will recurse into the nested object to do a field-by-field comparison.
  • Detect the change in memory footprint on each expression evaluation. This one is tedious and likely to break if we add more functions or window types.
  • Temporarily turn off caching in Document while the $setWindowFields stage is holding onto it. Since the stage will eventually project a new field, this should not affect performance. This could be interesting to try but may be invasive.
  • Add a callback to Document that fires when a field is added to the cache, so that window functions can adjust their memory tracking in near real time.
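
To make the last option concrete, here is a minimal sketch of the callback idea in Python. It is not the server's C++ Document class. `LazyDocument`, `WindowMemoryTracker`, and `approx_size` are hypothetical names invented for illustration, and the size estimate (length of `repr`) is a deliberately crude stand-in for real memory accounting.

```python
from typing import Any, Callable, Dict, Optional


def approx_size(value: Any) -> int:
    """Crude, deterministic size proxy (assumption: not real byte accounting)."""
    return len(repr(value))


class LazyDocument:
    """Toy document whose field cache is filled lazily on first access."""

    def __init__(
        self,
        raw: Dict[str, Any],
        on_cache_add: Optional[Callable[[str, int], None]] = None,
    ) -> None:
        self._raw = raw
        self._cache: Dict[str, Any] = {}
        self._on_cache_add = on_cache_add

    def get(self, field: str) -> Any:
        if field not in self._cache:
            value = self._raw[field]
            self._cache[field] = value
            # Notify the owner only when the cache actually grows.
            if self._on_cache_add is not None:
                self._on_cache_add(field, approx_size(value))
        return self._cache[field]


class WindowMemoryTracker:
    """Tracks cache growth for documents currently held by a window."""

    def __init__(self) -> None:
        self.bytes_in_use = 0
        self._per_doc: Dict[int, int] = {}

    def add_document(self, doc_id: int, raw: Dict[str, Any]) -> LazyDocument:
        self._per_doc[doc_id] = 0

        def on_add(field: str, size: int) -> None:
            # Charge the growth to this document and to the window total.
            self._per_doc[doc_id] += size
            self.bytes_in_use += size

        return LazyDocument(raw, on_add)

    def remove_document(self, doc_id: int) -> None:
        # Release exactly what was charged while the document was in the window.
        self.bytes_in_use -= self._per_doc.pop(doc_id)


tracker = WindowMemoryTracker()
doc = tracker.add_document(1, {"a": 10, "b": "hello"})
doc.get("a")  # first access caches "a" and fires the callback
doc.get("a")  # already cached, no further charge
```

With this shape, nothing is charged until a field is actually read, and removing a document from the window releases only what was charged for it. Whether a hook like this can be added to the real Document without hurting the hot path is an open question that this sketch does not address.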
