[SERVER-3645] Sharded collection counts (on primary) can report too many results Created: 18/Aug/11  Updated: 27/Oct/22  Resolved: 30/Mar/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 3.7.4

Type: Bug Priority: Major - P3
Reporter: Greg Studer Assignee: Ian Boros
Resolution: Done Votes: 52
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-5366 balancer did not remove chunk from ol... Closed
is depended on by SERVER-5902 check fileMD5 failure for count() pro... Closed
Documented
is documented by DOCS-11534 Docs for SERVER-3645: Sharded collect... Closed
Duplicate
duplicates SERVER-48685 very high IO utilization after upgrad... Closed
is duplicated by SERVER-24079 Different result between db.count and... Closed
is duplicated by SERVER-5665 count command does not check chunk ra... Closed
is duplicated by SERVER-8178 Odd Count / Document Results Differen... Closed
is duplicated by SERVER-8405 sharded count may incorrectly count m... Closed
is duplicated by SERVER-12082 count() on a sharded cluster includes... Closed
is duplicated by SERVER-15092 count is greater than itcount Closed
is duplicated by SERVER-26038 Count and distinct operations do not ... Closed
is duplicated by SERVER-29742 mongodump only creates a partial dump Closed
is duplicated by SERVER-14319 Counts on sharded clusters should use... Closed
Problem/Incident
causes SERVER-39191 Performance regression for counts pos... Closed
Related
related to SERVER-33753 count without predicate should be sha... Backlog
related to SERVER-5931 Secondary reads in sharded clusters n... Closed
is related to SERVER-13116 distinct isn't sharding aware Backlog
is related to SERVER-70810 SHARDING_FILTER stage missing on shar... Backlog
is related to SERVER-30708 _id index returning more than one doc... Closed
is related to SERVER-8948 Count() can be wrong in sharded colle... Closed
is related to SERVER-26316 cleanupOrphaned command is too slow Closed
is related to SERVER-50857 Improve count() performance in sharde... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2018-03-12, Query 2018-03-26, Query 2018-04-09
Participants:
Case:
Linked BF Score: 0

 Description   

Summary

Count does not filter out unowned (orphaned) documents and can therefore report larger values than one will find via a normal query, or using itcount() in the shell.
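
As a quick illustration (collection name hypothetical), the discrepancy can be observed in the shell by comparing the metadata-based count against an iterated cursor, which is shard-filtered on the primary:

// Hypothetical sharded collection; run via mongos.
db.foo.count()            // may include in-flight/orphaned documents
db.foo.find().itcount()   // iterates the cursor, so only owned documents are counted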

Causes

The following conditions can lead to counts being off:

  • Active migrations
  • Orphaned documents (left from failed migrations)
  • Non-Primary read preferences (see SERVER-5931)

Workaround

A workaround to get accurate counts is to ensure that all leftover migration data has been cleaned up and that no migrations are active. To query non-primaries, you must also ensure that there is no replication lag, including for any migration data, in addition to the requirements above.
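
A rough sketch of that workaround in the shell, assuming a hypothetical namespace test.foo (the balancer helpers and the cleanupOrphaned command exist as shown, though exact behavior varies by version):

// Via mongos: stop the balancer so no new migrations start.
sh.stopBalancer()
sh.isBalancerRunning()    // wait until this reports false

// On each shard's primary: remove orphaned documents for the namespace.
// On versions before 4.4 this cleans one range per call; repeat with the
// returned stoppedAtKey until that field is no longer present.
db.adminCommand({ cleanupOrphaned: "test.foo" })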

Non-Primary Reads

For issues with counts/reads from non-primaries, please see SERVER-5931.

Behavior of "fast count" and non-"fast count"

A "fast count" is a count run without a predicate. It is "fast" because the implementation only reads the metadata, without fetching any documents.

The problem of count() reporting inaccurate results has been fixed for non-"fast counts": starting in 4.0, counts run with a predicate are accurate on sharded clusters. "Fast counts" (count() run without a predicate) may still report too many documents (see SERVER-33753).
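
For example (collection and field names hypothetical), on 4.0 and later the two forms behave differently:

db.foo.count({ status: "A" })   // has a predicate: documents are examined and shard-filtered, so the result is accurate
db.foo.count()                  // "fast count": reads only metadata and may still over-count (SERVER-33753)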

In general, if one needs an accurate count of how many documents are in a collection, we do not recommend using the count command. Instead, we suggest using the $count aggregation stage, like this:

db.foo.aggregate([{$count: "nDocs"}]);

See the docs.

For users who need the performance of "fast count", and are okay with approximate results, we suggest using $collStats instead of the count command:

db.matrices.aggregate( [ { $collStats: { count: { } } } ] )
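
If a single number is wanted, one rough sketch (assuming the per-shard documents emitted by $collStats each carry a count field, as described in the $collStats documentation) is to sum them in a follow-up stage:

db.matrices.aggregate([
    { $collStats: { count: {} } },
    { $group: { _id: null, nDocs: { $sum: "$count" } } }
])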



 Comments   
Comment by Asya Kamsky [ 05/Mar/19 ]

lucasoares this was a pretty large change, so we currently don't have plans to backport it.

Comment by Lucas [ 26/Feb/19 ]

Hello!

Will this be backported to 3.6?

Thank you.

Comment by Githook User [ 30/Mar/18 ]

Author:

{'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'}

Message: SERVER-3645 make count() with a predicate accurate on sharded clusters
Branch: master
https://github.com/mongodb/mongo/commit/82fce7bd6f6e2838ce3ccc72474ff8b8ebe325a0

Comment by Githook User [ 26/Mar/18 ]

Author:

{'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'}

Message: Revert 0c082a81047cee66821e295d02e3588f7934ff64: SERVER-3645
Branch: master
https://github.com/mongodb/mongo/commit/6e0fb3da005b6fbe14b30c28bf3a148412635150

Comment by Ian Whalen (Inactive) [ 21/Mar/18 ]

Apologies for the additional email, but something about JIRA's underlying DB got out of sync so I'm reopening and closing this to get the ticket into the correct resolved state.

Comment by Matt Muscari [ 21/Mar/18 ]

Unsubscribe

Comment by Ian Boros [ 21/Mar/18 ]

See "Behavior of "fast count" and non-"fast count" section in the description for the behavior for 4.0 on.

Comment by Githook User [ 21/Mar/18 ]

Author:

{'email': 'ian.boros@10gen.com', 'name': 'Ian Boros'}

Message: SERVER-3645 make count() with a predicate accurate on sharded clusters
Branch: master
https://github.com/mongodb/mongo/commit/0c082a81047cee66821e295d02e3588f7934ff64

Comment by Charlie Swanson [ 08/Jan/18 ]

This came up again during work on SERVER-31785. Putting this back in "Needs Triage" so we can re-evaluate the priority.

Comment by Holger Morch [ 08/Mar/17 ]

This is a 6-year-old bug flagged as Major. Finally somebody is looking into it. Thanks.
Anyway, the assignee is still "Backlog". Maybe it will be fixed after another 6 years.

Comment by Farid Sharipov [ 10/Nov/15 ]

In our case we are seeing this problem in an unsharded collection with MongoDB 3.0:

mongos> db.messages.count()
15
mongos> db.messages.count()
15
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.count()
15
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.find({})
mongos> db.messages.findOne()
null
mongos> db.messages.count()
15
mongos> db.messages.count()
15
mongos> db.messages.count()
15
mongos> db.messages.count()
15
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.find()
mongos> db.messages.find().count()
15

Comment by Markus Mahlberg [ 28/Oct/15 ]

Interestingly, an aggregation counting the documents returns the correct value:

db.collection.aggregate({$group:{_id:"uniqueDocs",count:{$sum:1}}})

Comment by Patrick Lauber [ 03/Jun/15 ]

This bug cost me almost a week as I tried to understand why my import numbers differed from my source...

Comment by Jon Hyman [ 09/Dec/14 ]

Do you know if this is going to make it into 2.8? We have a 6-hour balancer window and our counts can be wrong during 25% of the day because of it.

Comment by Kevin J. Rice [ 28/Jul/14 ]

@Sam Flint - THANK YOU, that explain().n saves a bunch of time! Great hint!

Comment by sam flint [ 27/Jul/14 ]

You can use explain() to capture the correct count, and this is much faster than itcount(). We put this in our client-side application, calling explain().n.
As you can see, it is accurate and faster than itcount():
"cursor" : "BtreeCursor client_id_1_lists_1_order_1",
"n" : 5487153,
"nChunkSkips" : 17072,
"nYields" : 11907,
"nscanned" : 5672905,
"nscannedAllPlans" : 5672905,
"nscannedObjects" : 5672905,
"nscannedObjectsAllPlans" : 5672905,
"millisShardTotal" : 69749,
"millisShardAvg" : 9964,
"numQueries" : 7,
"numShards" : 7,
"millis" : 18282
}
mongos> db.profile.find({client_id : 3762}, {client_id:1}).count()
5503724
mongos> db.profile.find({client_id : 3762}, {client_id:1}).itcount()
5487153
mongos> db.profile.find({client_id : 3762}, {client_id:1}).explain().n
5487153

Comment by Srinivas Mutyala [ 24/Apr/14 ]

Additional issues with a sharded setup while sharding is in progress:

1) After the MongoDB initial sync, the number of documents in the individual shards is nearly equal, but not exactly.

2) While the sync is in progress, querying the total number of documents through the mongos (router) returns inconsistent results. Ideally it should be the same number every time. If the issue were only with count() that would be fine, but what about data consistency?

3) Initial sync time is directly proportional to the total data size and is very slow. This needs a fix.

Comment by Asya Kamsky [ 09/Jan/14 ]

This ticket is tracking count() - which uses the collection's metadata to quickly get the total count of documents.

There is a different ticket tracking the fact that when you query secondaries with a broadcast query (i.e. untargeted, not involving the shard key) and there is either a migration in progress or orphan documents left from an aborted migration, the secondary doesn't know to filter them out the way the primary would. That ticket is https://jira.mongodb.org/browse/SERVER-5931 - the workaround of reading from primaries when using non-targeted queries will work for you. If you are using targeted queries (ones with the shard key) then this should not be a problem whether you are on primaries or secondaries.

Comment by Jon Hyman [ 09/Jan/14 ]

No, I have query criteria. I see; I only get the issue when using secondary reads, so perhaps it is something unrelated to this issue. I patched my driver (Moped) to print out information; here's what I got from a count. Though as I mentioned, I can also reproduce this when querying over the data and incrementing my own counter, so it's not just on a count().

As far as I can tell, when it is bad, it is always the same shard. ObjectRocket, our hosting provider, pointed me to this issue but perhaps we can troubleshoot separately.

GOOD (returns a count of 101844)

Response [225, -645272788, 23, 1, 0, 0, 0, 1]
reply doc {"shards"=>

{"2f7abb37320b715c8ed68c86d29d93a7"=>25754, "cabe9e1b214ce57538a20ca6688a8ee0"=>25300, "dcb91ad1cd3630601020f83d7b6883e0"=>25581, "f1678636cab6fa77c76a9264ea9963a7"=>25209}, "n"=>101844, "ok"=>1.0}
Operation #<Moped::Protocol::Command
@length=145
@request_id=23
@response_to=0
@op_code=2004
@flags=[:slave_ok]
@full_collection_name="REDACTED.$cmd"
@skip=0
@limit=-1
@selector={:count=>"COLLECTION_REDACTED", :query=>{REDACTED}
@fields=nil>, reply #<Moped::Protocol::Reply
@length=225
@request_id=-645272788
@response_to=23
@op_code=1
@flags=[]
@cursor_id=0
@offset=0
@count=1
@documents=[{"shards"=>{"2f7abb37320b715c8ed68c86d29d93a7"=>25754, "cabe9e1b214ce57538a20ca6688a8ee0"=>25300, "dcb91ad1cd3630601020f83d7b6883e0"=>25581, "f1678636cab6fa77c76a9264ea9963a7"=>25209}

, "n"=>101844, "ok"=>1.0}]>
=> 101844

BAD (returns a count of 99503)

Response [225, -645270088, 24, 1, 0, 0, 0, 1]
reply doc {"shards"=>

{"2f7abb37320b715c8ed68c86d29d93a7"=>23413, "cabe9e1b214ce57538a20ca6688a8ee0"=>25300, "dcb91ad1cd3630601020f83d7b6883e0"=>25581, "f1678636cab6fa77c76a9264ea9963a7"=>25209}, "n"=>99503, "ok"=>1.0}
Operation #<Moped::Protocol::Command
@length=145
@request_id=24
@response_to=0
@op_code=2004
@flags=[:slave_ok]
@full_collection_name="REDACTED.$cmd"
@skip=0
@limit=-1
@selector={:count=>"COLLECTION_REDACTED", :query=>{REDACTED}
@fields=nil>, reply #<Moped::Protocol::Reply
@length=225
@request_id=-645270088
@response_to=24
@op_code=1
@flags=[]
@cursor_id=0
@offset=0
@count=1
@documents=[{"shards"=>{"2f7abb37320b715c8ed68c86d29d93a7"=>23413, "cabe9e1b214ce57538a20ca6688a8ee0"=>25300, "dcb91ad1cd3630601020f83d7b6883e0"=>25581, "f1678636cab6fa77c76a9264ea9963a7"=>25209}

, "n"=>99503, "ok"=>1.0}]>
=> 99503

Comment by Asya Kamsky [ 09/Jan/14 ]

jonhyman are these straight count() without a condition? This is not tied to secondary reads - during migrations the same documents exist on more than one shard (on the primary) on a perfectly functioning system.

Comment by Jon Hyman [ 09/Jan/14 ]

Hi there,

We just noticed this bug (it's been happening for a long time from our logs). We have a sharded setup and run a count over some documents using secondary reads.

While troubleshooting, I noticed that even if I iterated over the query and incremented my own counter, I would get the same wrong result part of the time. It's extremely sporadic; there are periods of the day where, if I run the same query over and over again, it alternates between returning the correct count and the incorrect count.

The problem is that now I'm extremely distrustful of secondary reads. Our application uses secondary reads in a lot of places because we have hundreds of millions of documents. It seems like I need to remove all secondary read preferences from my application. Does that seem like the right solution to you? If so, this is horrible IMO, and secondary reads are broken in the worst way (silently returning incorrect data).

Comment by Daniel Pasette (Inactive) [ 06/Nov/13 ]

This is a major change and unfortunately did not make the cut for 2.6

Comment by James Smith [ 05/Nov/13 ]

Is a fix for this targeted for 2.6?

Comment by auto [ 05/Oct/11 ]

Author:

{u'login': u'gregstuder', u'name': u'gregs', u'email': u'greg@10gen.com'}

Message: buildbot fix test b/c of SERVER-3645 on migrates
Branch: master
https://github.com/mongodb/mongo/commit/d8f91a17afa59a31222eeb377690a88af74be498

Comment by Eliot Horowitz (Inactive) [ 18/Aug/11 ]

We can't do a real count - need to count more metadata
