[DOCS-9121] db.collection.count documentation unclear on accuracy
Created: 10/Oct/16  Updated: 30/Oct/23  Resolved: 08/Aug/17

Status: Closed
Project: Documentation
Component/s: manual
Affects Version/s: 3.4.0
Fix Version/s: Server_Docs_20231030

Type: Bug
Priority: Major - P3
Reporter: John Murphy
Assignee: Ravind Kumar (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:
Days since reply: 6 years, 27 weeks, 2 days ago

 Description   

https://docs.mongodb.com/master/reference/method/db.collection.count/#behavior

This page suggests that db.collection.count() can be inaccurate when run against a sharded cluster.

However, if a query predicate is supplied and the query is run via a mongos, the count() will be accurate.

Please amend, for example by adding a note similar to the one below for "Accuracy after Unexpected Shutdown", or by changing the wording to something along the lines of: "On a sharded cluster, count() without a query clause can result in an inaccurate count if orphaned documents exist or if a chunk migration is in progress."



 Comments   
Comment by John Murphy [ 08/Aug/17 ]

Thanks all for the research into this one. I will await the results of SERVER-3645.

Comment by Ravind Kumar (Inactive) [ 07/Aug/17 ]

Thanks all - from what I'm gathering, the documentation is accurate as it stands.

Cheers.

Comment by David Storch [ 27/Jul/17 ]

renctan, that's correct: count plans still do not include the shard filter stage. This is tracked by SERVER-3645. Although this would be an easy fix when the count command has a predicate, we still use a "fast count" when there is no predicate, meaning that we just read storage engine metadata to obtain the record count. This behavior makes it difficult for the count command to behave correctly in a sharded environment in all cases. Furthermore, I believe the WiredTiger record count metadata can sometimes be inaccurate, although you'd have to ask the storage team for the current state of that problem.
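
To illustrate the difference described above, here is a minimal toy model in Python. It is only a sketch, not MongoDB's implementation: ToyShard, owned_ranges, cluster_count and the sample data are all hypothetical names invented for this illustration, and the "fast count" is modelled simply as the number of records physically stored on the shard. It assumes one shard still holds five orphaned documents left over from a chunk that has moved to another shard.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical stand-in for one shard; not MongoDB's real data structures."""
    owned_ranges: list                          # (low, high) shard-key ranges, high exclusive
    docs: list = field(default_factory=list)    # every document physically stored here

    def owns(self, doc):
        # A document is an orphan if its shard key falls outside every owned range.
        key = doc["shard_key"]
        return any(low <= key < high for low, high in self.owned_ranges)

    def fast_count(self):
        # No predicate: report the stored record count, orphans included.
        return len(self.docs)

    def count_without_shard_filter(self, predicate):
        # Predicate applied, but orphans are not filtered out.
        return sum(1 for d in self.docs if predicate(d))

    def count_with_shard_filter(self, predicate):
        # Predicate applied only to documents this shard actually owns.
        return sum(1 for d in self.docs if self.owns(d) and predicate(d))


def cluster_count(shards, per_shard_count):
    # Model of a router summing the per-shard results.
    return sum(per_shard_count(s) for s in shards)


# shard_a owns keys [0, 50) but still stores orphans with keys 50..54.
shard_a = ToyShard(owned_ranges=[(0, 50)],
                   docs=[{"shard_key": k} for k in range(0, 55)])
# shard_b owns keys [50, 100) and stores exactly those documents.
shard_b = ToyShard(owned_ranges=[(50, 100)],
                   docs=[{"shard_key": k} for k in range(50, 100)])
shards = [shard_a, shard_b]


def at_least_40(doc):
    return doc["shard_key"] >= 40


fast_total = cluster_count(shards, lambda s: s.fast_count())
unfiltered_total = cluster_count(shards, lambda s: s.count_without_shard_filter(at_least_40))
filtered_total = cluster_count(shards, lambda s: s.count_with_shard_filter(at_least_40))

print(fast_total)        # 105: 100 owned documents plus 5 orphans
print(unfiltered_total)  # 65: the 5 orphans also match the predicate
print(filtered_total)    # 60: orphans excluded
```

In this toy model, supplying a predicate does not by itself remove the orphans from the result; only the variant that checks ownership returns the true count.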

Comment by Randolph Tan [ 27/Jul/17 ]

After a quick look at the current code in master, it looks like the shard filter stage is still not included in the count command. david.storch, can you confirm that this is the case? In addition, secondaries don't keep track of sharding metadata as of v3.4 (this will change in v3.6 with the safe secondary reads project), so there is no easy way for them to filter out orphaned documents.

Generated at Thu Feb 08 07:57:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.