[SERVER-73717] When using in-memory storage engine spill to disk doesn't always work Created: 07/Feb/23  Updated: 11/Apr/23  Resolved: 11/Apr/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Adi Agrawal Assignee: Backlog - Query Execution
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-73757 Calling storageSize() on ephemeral te... Backlog
related to SERVER-74335 Spill to disk tests for $group must b... Closed
is related to SERVER-36388 inMemory storage engines should not s... Closed
Assigned Teams:
Query Execution
Operating System: ALL
Participants:

 Description   

The Hash Agg stage in the SBE engine spills to a record store, whereas the classic engine spills to files. Query execution tracks a statistic, "spilledDataStorageSize", introduced in https://jira.mongodb.org/browse/SERVER-73311, which reports the bytes spilled to the record store.

When testing this behavior on a build variant that requires the in-memory storage engine, the value of spilledDataStorageSize is 0, even though spilling is allowed. ([patch build: ).

This poses the question:

Should we even allow spilling to record stores when running on an in-memory build?

If not, we can disable spilling when running on in-memory stores.

If so, we can spill to a file instead of a record store, or ignore the memory limit.



 Comments   
Comment by David Storch [ 07/Feb/23 ]

I see the following options:

  1. Completely disallow spilling on in-memory. This would mean failing queries that hit the memory limit.
  2. On in-memory, allow the hash table for HashAggStage to grow without bound. This would prevent queries from failing due to reaching the memory limit but would mean that there is no memory constraint.
  3. On in-memory, spill to a non-ephemeral record store. This would be analogous to the old behavior, since the Sorter was allowed to spill persistent files to disk even when using the in-memory storage engine.
  4. Keep the behavior as is, meaning that we spill to an ephemeral TemporaryRecordStore.

Option (3) probably makes the most sense, but I'm not sure how easy it is to implement.
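
To make the trade-offs between the four options above concrete, here is a toy Python model of a hash aggregation accumulator under each policy. It is only a sketch and not server code: `SpillPolicy`, `HashAggModel`, and all field names are hypothetical, and the assumption that a persistent record store would report every spilled byte as storage size is mine. The only behavior taken from this ticket is that the ephemeral record store currently reports a size of 0.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SpillPolicy(Enum):
    """Hypothetical names for the four options listed in the comment."""
    FAIL_AT_LIMIT = auto()     # option 1: disallow spilling, fail the query
    GROW_UNBOUNDED = auto()    # option 2: ignore the memory limit
    SPILL_PERSISTENT = auto()  # option 3: spill to a non-ephemeral record store
    SPILL_EPHEMERAL = auto()   # option 4: current behavior


class MemoryLimitExceeded(Exception):
    """Raised when the limit is hit and spilling is not allowed."""


@dataclass
class HashAggModel:
    policy: SpillPolicy
    memory_limit_bytes: int
    in_memory_bytes: int = 0
    spilled_bytes: int = 0          # bytes handed to the spill target
    reported_storage_size: int = 0  # stand-in for spilledDataStorageSize

    def add(self, nbytes: int) -> None:
        """Account for nbytes of new group state."""
        if self.in_memory_bytes + nbytes <= self.memory_limit_bytes:
            self.in_memory_bytes += nbytes
            return
        if self.policy is SpillPolicy.FAIL_AT_LIMIT:
            raise MemoryLimitExceeded(f"limit of {self.memory_limit_bytes} bytes reached")
        if self.policy is SpillPolicy.GROW_UNBOUNDED:
            self.in_memory_bytes += nbytes  # no memory constraint at all
            return
        # Both spill policies flush the whole table, including the new data.
        flushed = self.in_memory_bytes + nbytes
        self.in_memory_bytes = 0
        self.spilled_bytes += flushed
        if self.policy is SpillPolicy.SPILL_PERSISTENT:
            # Assumption: a persistent store reports what was written to it.
            self.reported_storage_size += flushed
        # SPILL_EPHEMERAL: the reported size stays 0, as observed in this ticket.


def run(policy: SpillPolicy) -> HashAggModel:
    """Feed two 60-byte updates into a model with a 100-byte limit."""
    model = HashAggModel(policy=policy, memory_limit_bytes=100)
    model.add(60)
    model.add(60)
    return model
```

In this model, `run(SpillPolicy.SPILL_EPHEMERAL)` spills 120 bytes while the reported size stays 0, which mirrors the symptom in the description; `run(SpillPolicy.SPILL_PERSISTENT)` spills the same amount and reports it, and `run(SpillPolicy.FAIL_AT_LIMIT)` raises on the second update.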
