  1. Core Server
  2. SERVER-31234

HyperLogLog Counting

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels:
    • Query Optimization

      The use case is counting the number of distinct elements when the set is very large and only an approximate cardinality is needed.

      At present, there are two ways to count the number of distinct elements in a set while grouping:

      1. $addToSet followed by $size.
      2. $group with the element in _id followed by another $group stage which collects and counts all such documents.
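
      For illustration, the two approaches above might look like the following pipelines, written here as Python lists that could be passed to pymongo's collection.aggregate(). The field names group_key and element, and the output field distinct_count, are hypothetical and do not come from this ticket.

```python
# Approach 1: collect the distinct elements into an array, then take its size.
# "group_key" and "element" are hypothetical field names used for illustration.
add_to_set_pipeline = [
    {"$group": {"_id": "$group_key", "elements": {"$addToSet": "$element"}}},
    {"$project": {"distinct_count": {"$size": "$elements"}}},
]

# Approach 2: first emit one document per distinct (group_key, element) pair,
# then count those documents per group_key.
double_group_pipeline = [
    {"$group": {"_id": {"group_key": "$group_key", "element": "$element"}}},
    {"$group": {"_id": "$_id.group_key", "distinct_count": {"$sum": 1}}},
]

# With a pymongo collection object, either pipeline would be run as:
#     results = list(collection.aggregate(add_to_set_pipeline))
```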

      The first approach quickly runs into the 16MB document size limit, because the whole set is accumulated in a single document. The second approach has a large memory overhead and is therefore very slow.

      A HyperLogLog-based approach would reduce this overhead and would probably be faster.
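
      To make the idea concrete, here is a minimal sketch of a HyperLogLog counter in plain Python. It is not a proposed server implementation: the class name, the choice of SHA-1 as the hash, and the default precision p=12 are assumptions made for illustration. The point is that memory stays fixed at 2^p small registers no matter how many elements are added.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical class, not a MongoDB API)."""

    def __init__(self, p=12):
        # p is the number of index bits; use p >= 7 so the alpha formula applies.
        self.p = p
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        # Derive a 64-bit hash from the value (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)                  # top p bits select a register
        rest = x & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)          # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)
        return raw
```

      Usage would be to call add() for every element in a group and then count() once at the end. With p=12 the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, and duplicates do not change the estimate.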

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            hyades Aayush
            Votes:
            7
            Watchers:
            15

              Created:
              Updated: