Type: Task
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:

Assigned Teams: Query Integration
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

If you run a query like this on a time series collection:

db.diskio.distinct("metafield");

it will unpack every bucket in the collection. This can take a very long time.

In contrast, if you run this:

db.diskio.aggregate([{ $group: { _id: "$metafield" } }]);

it can be very fast, because our optimizer is smart enough not to unpack any buckets at all.

The difference in latency and CPU usage between these two queries is enormous, even though they compute essentially the same result. We should be able to optimize the distinct() case as well.
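Until distinct() is optimized, a caller could route the request through the fast $group path by hand. The following is only a sketch for mongosh: `distinctViaGroup` is a hypothetical helper name, not an existing shell or server API, and it assumes `toArray()` returns a plain array as it does in mongosh. It also assumes the field holds scalar values; distinct() flattens array values, and $group emits a null group for documents missing the field, so the results may differ in those cases.

```javascript
// Hypothetical helper (an assumption, not part of the ticket or of mongosh):
// emulate distinct() on a field by going through the $group pipeline.
// Caveats: array values are not flattened the way distinct() flattens them,
// and documents missing the field produce a null entry.
function distinctViaGroup(coll, field) {
  return coll
    .aggregate([{ $group: { _id: "$" + field } }]) // same pipeline as the fast query
    .toArray()                                     // assumed synchronous, as in mongosh
    .map((doc) => doc._id);                        // reshape to distinct()'s array-of-values form
}
```

In mongosh this would be called as `distinctViaGroup(db.diskio, "metafield")`.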

As motivation, this kind of query is super useful for creating a UI to interact with time series data. It lets a UI implementer answer the question: "what are the valid metafield values for this collection?" These are queries that ideally would be very fast.

Assignee: Nishith Atreya
Reporter: Chris Wolff
Participants: Chris Wolff, Nishith Atreya
Votes: 0
Watchers: 3

Created: May 13 2026 11:02:34 PM UTC
Updated: Jun 16 2026 06:39:39 PM UTC
