- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Query Execution
Consider a pipeline:
[..., {$sort: {_id: 1}}, {$limit: 1000}]
Currently, it produces the following split pipeline:
mergeType: 'router',
splitPipeline: {
  shardsPart: [ ..., { '$sort': { sortKey: { _id: 1 }, limit: Long('1000') } } ],
  mergerPart: [ { '$mergeCursors': { sort: { _id: 1 }, ... } }, { '$limit': Long('1000') } ]
},
As you can see, the limit is not leveraged in $mergeCursors.
$mergeCursors (AsyncResultsMerger under the hood) will buffer up to 16 MB from each shard and perform a merge sort.
If the number of shards is large enough, this can consume a significant amount of RAM on the router.
However, if the limit is known, we can detect documents that are already guaranteed to fall outside the limit and drop them (and even close the corresponding remote cursors), saving RAM.
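A hypothetical shape of the merger part, if the limit were propagated into $mergeCursors, might look like the following (the `limit` field on $mergeCursors is a proposed parameter, not one that exists today; the trailing $limit is kept to enforce the final document count):

```
mergerPart: [
  { '$mergeCursors': { sort: { _id: 1 }, limit: Long('1000'), ... } },
  { '$limit': Long('1000') }
]
```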
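The optimization described above can be sketched as a limit-aware k-way merge: once `limit` documents have been emitted in sort order, no further buffering from any remote is needed, so the remaining cursors can be closed. This is an illustrative sketch, not server code; the function name and the use of plain lists in place of remote cursors are assumptions for the example.

```python
import heapq

def merge_with_limit(shard_cursors, limit):
    """Merge already-sorted per-shard result streams by _id,
    stopping (and notionally closing the remotes) once `limit`
    documents have been produced."""
    iterators = [iter(cursor) for cursor in shard_cursors]
    # heapq.merge performs the k-way merge sort lazily, pulling one
    # document at a time from each stream instead of buffering them.
    merged = heapq.merge(*iterators, key=lambda doc: doc["_id"])
    out = []
    for doc in merged:
        out.append(doc)
        if len(out) == limit:
            break  # at this point a real merger could close all remote cursors
    return out

shards = [
    [{"_id": 1}, {"_id": 4}, {"_id": 7}],
    [{"_id": 2}, {"_id": 5}, {"_id": 8}],
    [{"_id": 3}, {"_id": 6}, {"_id": 9}],
]
print(merge_with_limit(shards, 4))
```

Because the merge is lazy, memory usage is bounded by the limit plus one buffered document per shard, rather than by up to 16 MB per shard.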
- related to SERVER-79805: Explain command does not report executionStats for the merger part of a split pipeline
- Status: Backlog