[DOCS-11534] Docs for SERVER-3645: Sharded collection counts (on primary) can report too many results Created: 03/Apr/18 Updated: 29/Oct/23 Resolved: 17/May/18 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 3.7.4 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Kay Kim (Inactive) | Assignee: | Kay Kim (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
| Participants: | |||||||||
| Days since reply: | 5 years, 38 weeks ago | ||||||||
| Epic Link: | DOCS: 4.0 Server | ||||||||
| Description |
Documentation Request Summary:
The page below will need to be updated:
The following paragraphs should be changed:
To avoid these situations, on a sharded cluster, use the $group stage of the db.collection.aggregate() method to $sum the documents. For example, the following operation counts the documents in a collection:
The new behavior is that, on a sharded cluster, a "fast count" may return inaccurate results. A count() with a predicate will not (as of 4.0). See the section 'Behavior of "fast count" and non-"fast count"' in the description of the ticket. If you have any questions, feel free to Slack, email, or comment on the ticket!
Scope of changes:
Impact to other docs outside of this product: none
MVP:
Resources:
Engineering Ticket Description:
Summary
Count does not filter out unowned (orphaned) documents and can therefore report larger values than one will find via a normal query, or using itcount() in the shell.
Causes
The following conditions can lead to counts being off:
Workaround
A workaround to get accurate counts is to ensure all migrations have been cleaned up and no migrations are active. To query non-primaries, you must additionally ensure that there is no replication lag, including for any migration data.
Non-Primary Reads
For issues with counts/reads from non-primaries please see
Behavior of "fast count" and non-"fast count"
A "fast count" is a count run without a predicate. It is "fast" because the implementation only reads the collection metadata, without fetching any documents.
The problem of count() reporting inaccurate results has been fixed for non-"fast counts": starting in 4.0, counts run with a predicate are accurate on sharded clusters. "Fast counts" (count() run without a predicate) may still report too many documents (see SERVER-33753).
In general, if one needs an accurate count of how many documents are in a collection, we do not recommend using the count command. Instead, we suggest using the $count aggregation stage, like this:
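A minimal mongo shell sketch of this suggestion follows. It shows a $count pipeline alongside the $group/$sum form mentioned in the documentation request. The helper name accurateCount, the output field name total, and the handle coll are illustrative assumptions. coll is assumed to behave like a shell collection, so that aggregate(pipeline).toArray() returns an array synchronously.

```javascript
// Pipeline using the $count stage; "total" is an arbitrary output field name.
const countPipeline = [{ $count: "total" }];

// Equivalent pipeline using $group with $sum: add 1 for every document.
const groupSumPipeline = [{ $group: { _id: null, total: { $sum: 1 } } }];

// Hypothetical helper (not part of any MongoDB API): runs one of the
// pipelines above against a shell-style collection handle.
function accurateCount(coll, pipeline) {
  const docs = coll.aggregate(pipeline || countPipeline).toArray();
  // Both pipelines emit no document at all when the input is empty.
  return docs.length === 0 ? 0 : docs[0].total;
}
```

In the shell this might be called as accurateCount(db.orders), where orders is a placeholder collection name.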
See the docs.
For users who need the performance of "fast count", and are okay with approximate results, we suggest using $collStats instead of the count command:
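A hedged sketch of the $collStats approach, again in mongo shell JavaScript. The names approxCountPipeline, approximateCount, total, and coll are assumptions made for illustration. On a sharded collection $collStats returns one document per shard, so this sketch adds a $group stage to sum the per-shard counts. The result is read from metadata and is therefore approximate.

```javascript
// $collStats must be the first stage; count: {} requests the document
// count taken from collection metadata (approximate, like a "fast count").
const approxCountPipeline = [
  { $collStats: { count: {} } },
  // One document per shard is emitted, so sum the per-shard "count" fields.
  { $group: { _id: null, total: { $sum: "$count" } } }
];

// Hypothetical helper: `coll` is assumed to be a shell-style collection
// handle whose aggregate(pipeline).toArray() returns an array.
function approximateCount(coll) {
  const docs = coll.aggregate(approxCountPipeline).toArray();
  return docs.length === 0 ? 0 : docs[0].total;
}
```

In the shell this might be called as approximateCount(db.orders), with orders standing in for a real collection name.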
|
| Comments |
| Comment by Githook User [ 23/May/18 ] |
|
Author: {'username': 'kay-kim', 'name': 'kay', 'email': 'kay.kim@10gen.com'}Message: |
| Comment by Githook User [ 17/May/18 ] |
|
Author: {'email': 'kay.kim@10gen.com', 'username': 'kay-kim', 'name': 'kay'}Message: |