- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Integration
We have observed a production pattern in which Atlas observability runs a $queryStats aggregation of the form:
[
{ $queryStats: {} },
{ $project: { _id: true, keyHash: true, metrics: { workingTimeMillis: { sum: true } } } },
{ $sort: { "metrics.workingTimeMillis.sum": -1 } },
{ $limit: 5000 }
]
This is one of the two $queryStats pipelines run every 60 seconds, and it has been called out as a recurring expensive shape on large clusters with high query-shape cardinality.
In the customer investigation discussed in Slack, a slow query log entry for this exact pattern showed about 9.8 s of CPU / working time for a single invocation. On the same cluster, serverStatus.metrics.queryStats showed roughly 300k+ entries and an estimated store size of about 600 MB+. As a back-of-the-envelope figure, that is on the order of 2 KB and about 30 µs of working time per entry, so a full scan + sort plausibly accounts for the cost, and it is expensive enough to warrant optimization.
The thread also explicitly distinguishes this from the existing keyHash lookup optimization work in SERVER-125570. That ticket helps the $match / exact-hash pattern, but the expensive pipeline here is the top-k sort case without a $match.
Requested improvement:
Teach $queryStats to recognize and optimize the common top-k pattern:
- $queryStats
- optional narrow projection
- $sort: { "metrics.workingTimeMillis.sum": -1 }
- $limit: K
so that we do not pay the full downstream cost for every entry when the caller only needs the top K results.
A likely direction discussed in the thread is to defer the expensive query-shape/HMAC application work until after the top K entries are selected, rather than doing that work for the entire store.
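To illustrate the deferral idea, here is a minimal Python sketch, not the server implementation. It assumes the sort key is available on each store entry before any per-entry rendering is done. `StoreEntry`, `render_entry`, `top_k_eager`, and `top_k_deferred` are hypothetical names, and the HMAC call only stands in for the real query-shape/HMAC work.

```python
import hashlib
import heapq
import hmac
from typing import Iterable, List, NamedTuple


class StoreEntry(NamedTuple):
    # Hypothetical stand-in for one query stats store entry.
    key_hash: str
    raw_shape: str
    working_time_sum_ms: int


def render_entry(entry: StoreEntry, hmac_key: bytes) -> dict:
    # Placeholder for the expensive per-entry work (shape rendering plus HMAC).
    # The "shape" field name is an assumption made for this sketch.
    digest = hmac.new(hmac_key, entry.raw_shape.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "keyHash": entry.key_hash,
        "shape": digest,
        "metrics": {"workingTimeMillis": {"sum": entry.working_time_sum_ms}},
    }


def top_k_eager(entries: Iterable[StoreEntry], k: int, hmac_key: bytes) -> List[dict]:
    # Baseline: render every entry, then sort descending and truncate to k.
    rendered = [render_entry(e, hmac_key) for e in entries]
    rendered.sort(key=lambda d: d["metrics"]["workingTimeMillis"]["sum"], reverse=True)
    return rendered[:k]


def top_k_deferred(entries: Iterable[StoreEntry], k: int, hmac_key: bytes) -> List[dict]:
    # Select the k largest entries on the raw metric first (O(n log k)),
    # then pay the rendering cost for those k entries only.
    top = heapq.nlargest(k, entries, key=lambda e: e.working_time_sum_ms)
    return [render_entry(e, hmac_key) for e in top]
```

Both functions return the same documents in the same order. The deferred variant calls `render_entry` K times instead of once per store entry, which is where the saving would come from if the per-entry rendering dominates the cost.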
Acceptance criteria:
- Detect the top-k sort pattern on metrics.workingTimeMillis.sum and avoid unnecessary full per-entry post-processing when only the top K results are needed.
- Preserve existing $queryStats semantics and result correctness.
- Demonstrably reduce CPU / latency for large-cardinality queryStats stores (e.g. hundreds of thousands of shapes).
- Add benchmark coverage for large stores and this exact pipeline shape.
- related to SERVER-125570 Pushdown $match on keyHash into $queryStats stage to avoid full store scan
- Backlog