[SERVER-35479] Make $collStats count behavior have "standard" error code and consistent behavior across topologies.
Created: 07/Jun/18  Updated: 29/Oct/23  Resolved: 07/Dec/20

Status: Closed
Project: Core Server
Component/s: Querying, Sharding
Affects Version/s: None
Fix Version/s: 4.9.0

Type: Bug
Priority: Major - P3
Reporter: Bernie Hackett
Assignee: Sam Mercier
Resolution: Fixed
Votes: 0
Labels: query-44-grooming, storch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-33039 mongod/mongos have inconsistent behav... Closed
Related
related to SERVER-53083 collStats results depend on topology ... Closed
related to SERVER-53268 Complete TODO listed in SERVER-35479 Closed
is related to SERVER-35522 Make $collStats with count never retu... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2020-11-30, Query 2020-12-14
Participants:
Case:

 Description   

New Description:

We have decided to fix the $collStats count results. This ticket will track transitioning to a "standard" error code, 26 (NamespaceNotFound), and ensuring that the result format conforms to the format laid out in the scope document.

Original Description:
Title: $collStats behavior inconsistent across topology types when namespace does not exist

When the namespace does not exist, $collStats behavior depends on the topology type. When connected to a mongos, the aggregate command succeeds and returns no results:

>>> c.list_database_names()
['admin', 'config', 'foo']
>>> pprint.pprint(c.admin.command('ismaster'))
{'$clusterTime': {'clusterTime': Timestamp(1528382274, 1),
                  'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                        b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                        b'\x00\x00\x00\x00',
                                'keyId': 0}},
 'ismaster': True,
 'localTime': datetime.datetime(2018, 6, 7, 14, 38, 2, 567000),
 'logicalSessionTimeoutMinutes': 30,
 'maxBsonObjectSize': 16777216,
 'maxMessageSizeBytes': 48000000,
 'maxWireVersion': 7,
 'maxWriteBatchSize': 100000,
 'minWireVersion': 0,
 'msg': 'isdbgrid',
 'ok': 1.0,
 'operationTime': Timestamp(1528382274, 1)}
>>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
{'$clusterTime': {'clusterTime': Timestamp(1528382172, 1),
                  'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                        b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                        b'\x00\x00\x00\x00',
                                'keyId': 0}},
 'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'},
 'ok': 1.0,
 'operationTime': Timestamp(1528382172, 1),
 'result': []}

When connected to a standalone server or a replica set, an undocumented, nonstandard error is returned (the error code is the same in both cases):

>>> c.list_database_names()
['admin', 'config', 'local']
>>> pprint.pprint(c.admin.command('ismaster'))
{'ismaster': True,
 'localTime': datetime.datetime(2018, 6, 7, 14, 44, 2, 765000),
 'logicalSessionTimeoutMinutes': 30,
 'maxBsonObjectSize': 16777216,
 'maxMessageSizeBytes': 48000000,
 'maxWireVersion': 7,
 'maxWriteBatchSize': 100000,
 'minWireVersion': 0,
 'ok': 1.0,
 'readOnly': False}
>>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
{'code': 40481,
 'codeName': 'Location40481',
 'errmsg': 'Unable to retrieve count in $collStats stage: Database [bar] not '
           'found.',
 'ok': 0.0}

The behavior should be consistent regardless of topology. I think there are two choices for how to resolve this:

  • Change the behavior for standalone and replica set to match mongos. That is, just don't return any results. This would match the behavior of using $group + $sum (or $count) to count the documents in a collection.
  • Change the error code to a standard, documented error, and have mongos return the same error. A good choice would be 26 (NamespaceNotFound).

My personal preference is for the first option, which matches the behavior of $count:

>>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$match": {}}, {"$count": "count"}], cursor={}, check=False))
{'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'}, 'ok': 1.0}
>>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$match": {}}, {"$group": {"_id": None, "count": {"$sum": 1}}}], cursor={}, check=False))
{'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'}, 'ok': 1.0}
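
Until the behavior is unified, a client that has to run against both topologies can normalize the two kinds of reply shown in the transcripts above. The following is only a minimal sketch, assuming the raw reply dicts have the shapes printed in this ticket (as returned with check=False). The helper name namespace_missing is hypothetical, and accepting code 26 alongside 40481 is an assumption based on the second option listed above.

```python
# Error codes taken from this ticket: 40481 is what a standalone or replica
# set returned here; 26 (NamespaceNotFound) is the proposed standard code.
MISSING_NAMESPACE_CODES = {26, 40481}


def namespace_missing(reply):
    """Return True if a raw $collStats count reply means "no such namespace".

    `reply` is the dict returned by db.command('aggregate', ..., check=False).
    """
    if reply.get("ok") == 1.0:
        # mongos-style reply: the command succeeds with an empty first batch.
        return not reply.get("cursor", {}).get("firstBatch")
    if reply.get("code") in MISSING_NAMESPACE_CODES:
        # standalone/replica-set-style reply: an error document.
        return True
    # Any other failure is unrelated to a missing namespace; surface it.
    raise RuntimeError(reply.get("errmsg", "command failed"))
```

With this in place, the mongos reply and the standalone reply from this ticket are both reported as a missing namespace, while any other error is still raised to the caller.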



 Comments   
Comment by Githook User [ 15/Dec/20 ]

Author:

{'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'}

Message: SERVER-53268 Complete TODO listed in SERVER-35479
Branch: master
https://github.com/mongodb/mongo/commit/c425bdcf1862d642460211fcf450664233a9e6d0

Comment by Githook User [ 07/Dec/20 ]

Author:

{'name': 'samontea', 'email': 'merciers.merciers@gmail.com', 'username': 'samontea'}

Message: SERVER-35479 Make $collStats count behavior have "standard" error code and consistent behavior across topologies
Branch: master
https://github.com/mongodb/mongo/commit/c2deb97265c1b19a193ecdb58be1197bdbbd630f

Comment by Asya Kamsky [ 13/Nov/20 ]

It seems clear to me that, for the user, option one would be better and option two would be much, much worse (backwards-breaking and making things less consistent).

Comment by David Storch [ 08/Jun/18 ]

I closed SERVER-33039 as a duplicate of this newer ticket, so that we can keep any further discussion here.

Comment by Asya Kamsky [ 08/Jun/18 ]

Even with a db/collection that exists, $collStats on a sharded cluster seems to return as many documents as there are shards rather than a single result.
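
A client that wants a single number in that situation has to combine the per-shard documents itself. A minimal sketch, assuming each returned document carries a numeric count field; the helper name total_count and the sample documents are hypothetical and are not output from a real cluster.

```python
def total_count(first_batch):
    """Sum the `count` field over the documents returned by $collStats.

    `first_batch` is the list of result documents; on a sharded cluster
    there may be one document per shard. An empty list sums to 0.
    """
    return sum(doc.get("count", 0) for doc in first_batch)


# Hypothetical per-shard results, for illustration only.
sample = [
    {"shard": "shard0", "count": 3},
    {"shard": "shard1", "count": 4},
]
combined = total_count(sample)
```

The same helper also works for a single-document result from a standalone server or replica set, so the caller does not need to branch on topology.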

Comment by Kyle Suarez [ 07/Jun/18 ]

I believe this is a duplicate of SERVER-33039.
