-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Query Execution
-
QE 2024-12-09, QE 2024-12-23, QE 2025-01-06
When the dedup flag is set, OrStage tracks the ids of the records that have already been seen. Currently there is no limit on the amount of memory that can be used to store the seen recordIds.
To set the maximum allowed memory, a new query knob internalOrStageMaxMemoryBytes will be added in query_knobs.idl.
The stage will spill to disk when the seen data structure exceeds the maximum memory allowed.
The spilling should be implemented in a method
void spill(uint64_t maximumMemoryUsage)
that spills until the memory used by the stage is at most maximumMemoryUsage. The method should track the following metrics:
- bool usedDisk : Set to true when the stage has spilled.
- uint64_t spills : The number of times the stage spilled.
- uint64_t spilledBytes : The size, in bytes, of the memory released with spilling.
- uint64_t spilledDataStorageSize : The size, in bytes, of disk space used for spilling.
To track those metrics, we should update the OrStats struct. The metrics should be reported in serverStatus and in explain execution stats.
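As a rough illustration of the intended spill(maximumMemoryUsage) contract and the four metrics, here is a minimal, self-contained sketch. It is not the server implementation: SpillStats and SeenSet are hypothetical names, record ids are modeled as int64_t, the "disk" is just a sorted in-memory vector, and memory usage counts only the id payload bytes.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the spill-related fields that OrStats would gain.
struct SpillStats {
    bool usedDisk = false;                // true once the stage has spilled
    uint64_t spills = 0;                  // number of spill events
    uint64_t spilledBytes = 0;            // memory released by spilling
    uint64_t spilledDataStorageSize = 0;  // size of the spilled data on "disk"
};

// Toy model of the seen-recordId set (assumption: ids are plain int64_t).
class SeenSet {
public:
    static constexpr uint64_t kBytesPerId = sizeof(int64_t);

    // Returns true if `id` was not seen before, in memory or in spilled data.
    bool insert(int64_t id) {
        if (std::binary_search(_disk.begin(), _disk.end(), id)) {
            return false;
        }
        return _inMemory.insert(id).second;
    }

    // Simplified accounting: payload bytes only, no container overhead.
    uint64_t memoryUsage() const {
        return _inMemory.size() * kBytesPerId;
    }

    // Moves ids out of memory until memoryUsage() <= maximumMemoryUsage.
    void spill(uint64_t maximumMemoryUsage) {
        if (memoryUsage() <= maximumMemoryUsage) {
            return;  // nothing to do, so no metrics change
        }
        const uint64_t before = memoryUsage();
        while (memoryUsage() > maximumMemoryUsage) {
            auto it = _inMemory.begin();
            _disk.push_back(*it);
            _inMemory.erase(it);
        }
        std::sort(_disk.begin(), _disk.end());  // keep "disk" searchable

        stats.usedDisk = true;
        ++stats.spills;
        stats.spilledBytes += before - memoryUsage();
        stats.spilledDataStorageSize = _disk.size() * kBytesPerId;
    }

    SpillStats stats;

private:
    std::unordered_set<int64_t> _inMemory;
    std::vector<int64_t> _disk;  // stand-in for real on-disk storage
};
```

In this sketch a caller would invoke spill(limit) after each insert that pushes memoryUsage() above the knob value. Deduplication stays correct after spilling because insert consults the spilled ids as well.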
Before spilling, the stage should make sure that there is enough disk space for spilling. This can be done using ensureSufficientDiskSpaceForSpilling and uassertStatusOK.
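The pre-spill check could look roughly like the sketch below. This is a stand-in built on std::filesystem::space, not the server's ensureSufficientDiskSpaceForSpilling, whose exact signature is not given in this ticket. hasSufficientDiskSpace and ensureDiskSpaceOrThrow are hypothetical names, and the thrown std::runtime_error only imitates what uassertStatusOK would do with a non-OK status.

```cpp
#include <cstdint>
#include <filesystem>
#include <stdexcept>
#include <system_error>

// Hypothetical helper: true if `dir` has at least `minRequiredBytes` available.
// A failure to query the filesystem is treated as "not enough space".
bool hasSufficientDiskSpace(const std::filesystem::path& dir,
                            std::uintmax_t minRequiredBytes) {
    std::error_code ec;
    const std::filesystem::space_info info = std::filesystem::space(dir, ec);
    if (ec) {
        return false;
    }
    return info.available >= minRequiredBytes;
}

// Hypothetical analogue of wrapping the check in uassertStatusOK:
// abort the spill with an exception instead of writing to a full disk.
void ensureDiskSpaceOrThrow(const std::filesystem::path& dir,
                            std::uintmax_t minRequiredBytes) {
    if (!hasSufficientDiskSpace(dir, minRequiredBytes)) {
        throw std::runtime_error("insufficient disk space for spilling");
    }
}
```

Calling the check immediately before each spill, rather than once at stage construction, keeps the failure close to the point where the disk write would happen.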
A second method, to retrieve the spilled data, should be added so that the OrStage can continue executing by reading data back from disk. The method should keep the memory usage below the threshold at all times.
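One way to keep memory bounded while reading spilled data back is to stream it in fixed-size batches. The sketch below assumes a flat binary file of int64_t ids; writeSpilled and readSpilled are hypothetical helpers, and the real stage would use the server's own spilling storage rather than raw files.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: writes ids to `path` as raw 8-byte values.
void writeSpilled(const std::string& path, const std::vector<int64_t>& ids) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(ids.data()),
              static_cast<std::streamsize>(ids.size() * sizeof(int64_t)));
}

// Hypothetical helper: streams ids from `path`, holding at most
// maxMemoryBytes of ids in memory at once (but never fewer than one id,
// so a limit below 8 bytes is exceeded by design in this sketch).
void readSpilled(const std::string& path,
                 uint64_t maxMemoryBytes,
                 const std::function<void(const int64_t*, std::size_t)>& onBatch) {
    const std::size_t idsPerBatch = static_cast<std::size_t>(
        std::max<uint64_t>(1, maxMemoryBytes / sizeof(int64_t)));
    std::vector<int64_t> batch(idsPerBatch);  // the only buffer we allocate
    std::ifstream in(path, std::ios::binary);

    while (true) {
        in.read(reinterpret_cast<char*>(batch.data()),
                static_cast<std::streamsize>(idsPerBatch * sizeof(int64_t)));
        const std::size_t n =
            static_cast<std::size_t>(in.gcount()) / sizeof(int64_t);
        if (n == 0) {
            break;  // end of file (or unreadable file)
        }
        onBatch(batch.data(), n);  // the final batch may be shorter
    }
}
```

The callback receives a pointer into the reusable buffer, so the data must be consumed or copied before the next batch is read.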
The stage should release all memory and disk when it is closed.
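Releasing disk on close can be tied to object lifetime. The following is a minimal RAII sketch under the assumption that spilled data lives in a single file; SpillFile is a hypothetical name and does not correspond to an existing server class.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>
#include <utility>

// Hypothetical RAII guard: creates an empty spill file and removes it
// when the owning object (for example, the stage) is destroyed.
class SpillFile {
public:
    explicit SpillFile(std::filesystem::path path) : _path(std::move(path)) {
        std::ofstream create(_path, std::ios::binary | std::ios::trunc);
    }

    ~SpillFile() {
        std::error_code ec;  // never throw from a destructor
        std::filesystem::remove(_path, ec);
    }

    // Non-copyable so that the file is removed exactly once.
    SpillFile(const SpillFile&) = delete;
    SpillFile& operator=(const SpillFile&) = delete;

    const std::filesystem::path& path() const {
        return _path;
    }

private:
    std::filesystem::path _path;
};
```

Holding such a guard as a member of the stage means the disk space is reclaimed on every exit path, including when the query is killed or fails with an exception.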
- has to be done after
-
SERVER-97812 OrStage should use RecordIdDeduplicator to track duplicate recordIds
- Closed
- is depended on by
-
SERVER-24375 Deduping in OR, SORT_MERGE, and IXSCAN (multikey case) uses unbounded memory
- Backlog