Core Server / SERVER-61332

Introduce $approxCount stage that can work on a view

    • Type: New Feature
    • Resolution: Won't Do
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • QO 2021-11-15, QO 2021-11-29, QO 2022-01-24, QO 2022-02-07, QO 2022-02-21

      The $count stage works on both collections and views, but always gives an exact count. The $collStats stage can produce a faster, estimated count, but it only works on a collection. $collStats also returns other information that might not be relevant or easy to define for a view.
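For reference, the two existing options look roughly like this in mongosh-style JavaScript. The stage shapes follow the documented $count and $collStats syntax; the output field name `n` and the collection name `orders` are placeholders chosen for illustration.

```javascript
// Exact count: works on collections and views, and is always exact.
const exactCountPipeline = [{ $count: "n" }];

// Estimated count from collection metadata: fast, but collections only,
// and $collStats must be the first stage in the pipeline.
const estimatedCountPipeline = [
  { $collStats: { count: {} } },
  { $project: { _id: 0, n: "$count" } },
];

// In mongosh (not executed here; "orders" is a hypothetical collection):
//   db.orders.aggregate(exactCountPipeline)
//   db.orders.aggregate(estimatedCountPipeline)
```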

      Let's introduce a new stage which:

      • like $count, can run anywhere in a pipeline, on a collection or view.
      • like $collStats, can provide a fast estimated count.
      • lets the user decide whether to allow falling back to a slower, exact count.

      Syntax could be something like:

      {$approxCount: {
          as: <string>,
          errorWhenNoApproximation: <boolean, defaults to true>
      }}
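The intended semantics of the two options can be modelled in a few lines of plain JavaScript. This is only an in-memory sketch, not server code: the `source` object, its `estimatedCount` and `docs` fields, and the error message are all assumptions made for illustration, and the "view" here simply stands for any source that has no cheap estimate.

```javascript
// Hypothetical in-memory model of the proposed stage; not server code.
// `source.estimatedCount` stands in for collection metadata and is
// undefined for sources that have no cheap estimate.
function approxCount(source, { as, errorWhenNoApproximation = true }) {
  if (typeof as !== "string" || as.length === 0) {
    throw new TypeError("'as' must be a non-empty string");
  }
  if (source.estimatedCount !== undefined) {
    // Fast path: report the estimate without touching the documents.
    return [{ [as]: source.estimatedCount }];
  }
  if (errorWhenNoApproximation) {
    // Default behaviour: refuse rather than silently run a slow count.
    throw new Error("$approxCount: no approximation available");
  }
  // Slow path: the caller opted in to an exact count.
  return [{ [as]: source.docs.length }];
}

const collection = { docs: [{}, {}, {}], estimatedCount: 3 };
const view = { docs: [{}, {}] }; // assumed to have no metadata to estimate from

const fromCollection = approxCount(collection, { as: "n" });
const fromView = approxCount(view, { as: "n", errorWhenNoApproximation: false });
```

With the default `errorWhenNoApproximation: true`, the same call on `view` throws instead of counting, which makes the slow path something the user has to ask for explicitly.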

      Implementation notes:

      • This would be a new subclass of DocumentSource.
      • getNext() can do whatever $collStats does to get the estimated count.
      • optimizeAt() can eliminate stages that preserve count. [$project, $approxCount] becomes just [$approxCount].
      • If an estimated count is not possible, and falling back to an exact count is allowed (errorWhenNoApproximation is false), then during optimization it can expand to the same $group stage that $count expands to. Then other optimizations may apply to the $group.
      • On a sharded collection, $approxCount can run partially on the shards and partially on the merger: each shard runs its own $approxCount, and the merger combines the per-shard results with a $group that $sums them.
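The optimization and sharding notes above can be sketched on pipelines represented as plain arrays of stage objects. This is a JavaScript model under stated assumptions, not the C++ DocumentSource implementation: the function names, the `canEstimate` flag, and the particular set of count-preserving stages are hypothetical, and the model assumes an estimate is only possible when no stage remains in front of $approxCount. The fallback uses the $group plus $project form that the $count documentation gives as its equivalent.

```javascript
// Assumed set of stages that never change how many documents pass through.
const COUNT_PRESERVING = new Set(["$project", "$addFields", "$set", "$unset", "$sort"]);

const stageName = (stage) => Object.keys(stage)[0];

// Rewrites a pipeline around its first $approxCount stage.
// `canEstimate` says whether the underlying source offers a cheap estimate.
function optimizeApproxCount(pipeline, canEstimate) {
  const idx = pipeline.findIndex((s) => stageName(s) === "$approxCount");
  if (idx === -1) return pipeline;

  // Drop count-preserving stages immediately before $approxCount.
  let start = idx;
  while (start > 0 && COUNT_PRESERVING.has(stageName(pipeline[start - 1]))) {
    start -= 1;
  }
  const before = pipeline.slice(0, start);
  const after = pipeline.slice(idx + 1);
  const { as, errorWhenNoApproximation = true } = pipeline[idx].$approxCount;

  // In this model an estimate is only possible when nothing precedes the stage.
  if (canEstimate && before.length === 0) {
    return [pipeline[idx], ...after];
  }
  if (errorWhenNoApproximation) {
    throw new Error("$approxCount: no approximation available");
  }
  // Fallback: expand to the exact-count form that $count is equivalent to.
  return [
    ...before,
    { $group: { _id: null, [as]: { $sum: 1 } } },
    { $project: { _id: 0 } },
    ...after,
  ];
}

// Merger side: add up the per-shard partial counts.
function mergeShardCounts(shardResults, as) {
  const total = shardResults.reduce((sum, doc) => sum + doc[as], 0);
  return { [as]: total };
}

const simplified = optimizeApproxCount(
  [{ $project: { a: 1 } }, { $approxCount: { as: "n" } }],
  true
);
const expanded = optimizeApproxCount(
  [
    { $match: { x: 1 } },
    { $sort: { x: 1 } },
    { $approxCount: { as: "n", errorWhenNoApproximation: false } },
  ],
  true
);
const merged = mergeShardCounts([{ n: 3 }, { n: 4 }], "n");
```

Here `simplified` keeps only the $approxCount stage, `expanded` keeps the $match but replaces the $sort and $approxCount with the $group and $project pair, and `merged` sums the two shard results.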

            matt.boros@mongodb.com Matt Boros
            david.percy@mongodb.com David Percy