[SERVER-31234] Hyperloglog Counting Created: 24/Sep/17 Updated: 20/Oct/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Aayush | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 7 |
| Labels: | expression | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Query Optimization
|
| Participants: |
| Description |
|
The use case is to count the number of distinct elements where the set size is very large and an approximate cardinality is sufficient. Presently, there are two ways to count the number of distinct elements in a set while grouping:
The first approach has the problem that the 16MB document size limit may be reached quite quickly. The second approach has a lot of memory overhead and is therefore very slow. A HyperLogLog-based approach would help reduce these overheads and would probably be faster. |
| Comments |
| Comment by Arturs Sosins [ 23/Feb/23 ] | |||||||||||||||||||||
|
It does not have to be "the" cardinality-counting pipeline operator. It could literally be a $hyperloglog operator/stage that does this one specific thing: to get users per day,
get the count for each day
or merge multiple sets to get the total count
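For illustration only, here is a minimal Python sketch of the HyperLogLog idea behind the per-day and merge use cases above. It is not MongoDB code and does not reflect any proposed operator syntax. The class name `HyperLogLog`, its methods, the choice of SHA-1 as a 64-bit hash source, and the default of 2^12 registers are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Hypothetical minimal HyperLogLog sketch (illustrative only)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (assumes m >= 128)
        self.registers = [0] * self.m

    def add(self, value):
        # Derive a 64-bit hash from the first 8 bytes of a SHA-1 digest.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                # top p bits select the register
        rest = h & ((1 << width) - 1)   # remaining bits feed the rank
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # The union of two sketches is the element-wise maximum of registers.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


# One sketch per day (user IDs here are made up for the example).
monday, tuesday = HyperLogLog(), HyperLogLog()
for user_id in range(0, 10_000):
    monday.add(user_id)
for user_id in range(5_000, 15_000):
    tuesday.add(user_id)

print("Monday:", monday.count())
print("Tuesday:", tuesday.count())
print("Both days:", monday.merge(tuesday).count())
```

Each sketch occupies a fixed number of registers regardless of how many users it has seen, which is why it avoids both the document size limit and the memory overhead mentioned in the description. The trade-off is an approximate count, with a typical relative error of about 1.04/sqrt(m).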
| |||||||||||||||||||||
| Comment by apocarteres [ 16/Nov/17 ] | |||||||||||||||||||||
|
I guess it's not possible unless MongoDB supports plugins. Implementing cardinality counting with a hardcoded HLL+ would produce backward compatibility issues if MongoDB decides to pick something else to count cardinality in the future. | |||||||||||||||||||||
| Comment by Kelsey Schubert [ 25/Sep/17 ] | |||||||||||||||||||||
|
Hi hyades, Thank you for the feature request; I've marked it for consideration. Please continue to watch this ticket for updates. Kind regards, |