  1. Core Server
  2. SERVER-3645

Sharded collection counts (on primary) can report too many results

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.7.4
    • Component/s: Querying
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Query 2018-03-12, Query 2018-03-26, Query 2018-04-09

      Description

      Summary

      The count command does not filter out unowned (orphaned) documents, so on a sharded collection it can report a larger value than a normal query returns, or than itcount() reports in the shell.
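To see whether a collection is affected, the two numbers mentioned above can be compared side by side. The helper below is a minimal sketch: `countDiscrepancy` is a hypothetical name, and it assumes `coll` is a shell collection handle (for example `db.foo`) exposing `count()` and `find().itcount()`.

```javascript
// Hypothetical helper: compare the value reported by count() with the
// number of documents a normal query actually returns.
// `coll` is assumed to be a shell collection handle such as db.foo.
function countDiscrepancy(coll) {
  const reported = coll.count();          // may include orphaned documents
  const iterated = coll.find().itcount(); // iterates the cursor and counts results
  return { reported: reported, iterated: iterated, excess: reported - iterated };
}

// Usage in the shell (not executed here):
//   countDiscrepancy(db.foo)
```

A non-zero `excess` suggests orphaned documents or an active migration, although a concurrent write between the two calls would also produce a difference.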

      Causes

      The following conditions can lead to counts being off:

      • Active migrations
      • Orphaned documents (left from failed migrations)
      • Non-Primary read preferences (see SERVER-5931)
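The orphaned-document case can be illustrated with a toy model. This is not MongoDB code: the shard names, key ranges, and function names are all made up for illustration. It assumes a chunk covering keys [50, 100) was migrated from shardA to shardB and the copies on shardA were never cleaned up.

```javascript
// Toy model only (all names hypothetical): each shard stores some shard-key
// values and owns some half-open key ranges [min, max).
const exampleShards = [
  { name: "shardA", owns: [[0, 50]],   keys: [5, 20, 45, 60, 75] }, // 60 and 75 are orphans
  { name: "shardB", owns: [[50, 100]], keys: [60, 75, 90] },
];

// True if the shard currently owns the range containing `key`.
function ownsKey(shard, key) {
  return shard.owns.some(([min, max]) => key >= min && key < max);
}

// Sums every stored document on every shard, orphans included.
function unfilteredCount(shards) {
  return shards.reduce((n, s) => n + s.keys.length, 0);
}

// Sums only the documents each shard actually owns.
function filteredCount(shards) {
  return shards.reduce((n, s) => n + s.keys.filter((k) => ownsKey(s, k)).length, 0);
}
```

In this model `unfilteredCount(exampleShards)` is 8 while `filteredCount(exampleShards)` is 6; the two extra documents are the orphaned copies on shardA.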

      Workaround

      To get accurate counts, ensure that no migrations are active and that orphaned documents left by earlier migrations have been cleaned up. To query non-primaries, you must additionally ensure that there is no replication lag, including for any migrated data.

      Non-Primary Reads

      For issues with counts or reads from non-primaries, see SERVER-5931.

      Behavior of "fast count" and non-"fast count"

      A "fast count" is a count run without a predicate. It is "fast" because the implementation only reads the metadata, without fetching any documents.

      The problem of count() reporting inaccurate results has been fixed for non-"fast counts": starting in 4.0, counts run with a predicate are accurate on sharded clusters. "Fast counts" (count() run without a predicate) may still report too many documents (see SERVER-33753).

      In general, if one needs an accurate count of how many documents are in a collection, we do not recommend using the count command. Instead, we suggest using the $count aggregation stage, like this:

      db.foo.aggregate([{$count: "nDocs"}]);
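The aggregation returns a cursor rather than a number, and to my knowledge $count emits no document at all when its input is empty. A small wrapper can smooth over both. This is a sketch: `exactCount` is a hypothetical name, and it assumes a shell collection handle whose `aggregate()` cursor has a synchronous `toArray()`.

```javascript
// Hypothetical wrapper around the $count stage shown above.
// `coll` is assumed to be a shell collection handle such as db.foo.
function exactCount(coll) {
  const result = coll.aggregate([{ $count: "nDocs" }]).toArray();
  // An empty result means the collection had no documents to count.
  return result.length === 0 ? 0 : result[0].nDocs;
}

// Usage in the shell (not executed here):
//   exactCount(db.foo)
```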
      

      See the $count documentation.

      For users who need the performance of "fast count", and are okay with approximate results, we suggest using $collStats instead of the count command:

      db.matrices.aggregate( [ { $collStats: { count: { } } } ] )
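On a sharded cluster, as far as I know, $collStats emits one document per shard, so the per-shard counts still need to be added up to get a single approximate total. One way to do that is to append a $group stage. The function name below is hypothetical; the output field name `count` is an arbitrary choice.

```javascript
// Hypothetical helper: builds a pipeline that sums the per-shard counts
// reported by $collStats into one approximate total.
function approximateCountPipeline() {
  return [
    { $collStats: { count: {} } },
    // Collapse the per-shard documents into a single total.
    { $group: { _id: null, count: { $sum: "$count" } } },
  ];
}

// Usage in the shell (not executed here):
//   db.matrices.aggregate(approximateCountPipeline())
```

The total is still based on collection metadata, so it carries the same approximation as a "fast count".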
      
