  1. Core Server
  2. SERVER-31234

Hyperloglog Counting

Details

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • Aggregation Framework
    • Query Optimization

    Description

      The use case is counting the number of distinct elements when the set is very large and only an approximate cardinality is needed.

      Presently, there are two ways to count the number of distinct elements in a set while grouping:

      1. $addToSet followed by $size.
      2. $group with the element in _id followed by another $group stage which collects and counts all such documents.

      The first approach can quickly hit the 16MB document size limit. The second approach has a lot of memory overhead and is therefore very slow.
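
      For illustration, the two existing approaches might look like the pipelines below, written as Python lists of the kind a driver such as pymongo accepts in `collection.aggregate(...)`. The field names `category` and `user_id` and the output names `users` and `distinct_users` are hypothetical and do not come from this ticket.

```python
# Hypothetical document shape: {"category": ..., "user_id": ...}

# Approach 1: collect the distinct values of each group into an array,
# then take the array's size. The array lives inside a single document,
# so it is subject to the 16MB document size limit.
pipeline_add_to_set = [
    {"$group": {"_id": "$category", "users": {"$addToSet": "$user_id"}}},
    {"$project": {"distinct_users": {"$size": "$users"}}},
]

# Approach 2: group on the (group key, element) pair to deduplicate,
# then group again on the group key and count the remaining documents.
pipeline_double_group = [
    {"$group": {"_id": {"category": "$category", "user_id": "$user_id"}}},
    {"$group": {"_id": "$_id.category", "distinct_users": {"$sum": 1}}},
]
```

      Both pipelines return an exact count, so the work they do grows with the number of distinct values in each group.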

      A HyperLogLog-based approach would reduce these overheads and would probably be faster.
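
      To make the idea concrete, here is a minimal sketch of a HyperLogLog counter in plain Python. It is not a proposed server implementation: the class name, the choice of SHA-1 as the hash, and the default of 2**12 registers are all assumptions made for illustration. The point is that the state is a fixed-size array of small integers regardless of how many distinct values are seen, with a typical relative error of roughly 1.04 / sqrt(m) for m registers.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch using 2**p registers (assumes p >= 7)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant; this formula is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Derive a 64-bit hash from the value (SHA-1 is an arbitrary choice).
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = x >> width                 # first p bits select a register
        rest = x & ((1 << width) - 1)    # remaining bits
        # 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        # Harmonic-mean based raw estimate.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate
```

      Adding the same value twice leaves the registers unchanged, so duplicates cost no extra memory. Two such counters built with the same `p` could also be combined by taking the element-wise maximum of their registers, which is what would make the technique a natural fit for a grouping accumulator.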

          People

            Assignee: Backlog - Query Optimization (backlog-query-optimization)
            Reporter: Aayush (hyades)
            Votes: 7
            Watchers: 15

            Dates

              Created:
              Updated: