Core Server / SERVER-35479

Make $collStats count behavior have "standard" error code and consistent behavior across topologies.

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Query 2020-11-30, Query 2020-12-14

      New Description:

      We have decided to fix the $collStats count results. This ticket tracks the transition to the "standard" error code 26 (NamespaceNotFound) when the namespace doesn't exist, and ensures that the result format conforms to the format laid out in the scope document.

      Original Description:
      Title: $collStats behavior inconsistent across topology types when namespace does not exist

      When the namespace does not exist, the behavior of $collStats depends on the topology type. When connected to a mongos, the aggregate command succeeds and returns no results:

      >>> c.list_database_names()
      ['admin', 'config', 'foo']
      >>> pprint.pprint(c.admin.command('ismaster'))
      {'$clusterTime': {'clusterTime': Timestamp(1528382274, 1),
                        'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00',
                                      'keyId': 0}},
       'ismaster': True,
       'localTime': datetime.datetime(2018, 6, 7, 14, 38, 2, 567000),
       'logicalSessionTimeoutMinutes': 30,
       'maxBsonObjectSize': 16777216,
       'maxMessageSizeBytes': 48000000,
       'maxWireVersion': 7,
       'maxWriteBatchSize': 100000,
       'minWireVersion': 0,
       'msg': 'isdbgrid',
       'ok': 1.0,
       'operationTime': Timestamp(1528382274, 1)}
      >>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
      {'$clusterTime': {'clusterTime': Timestamp(1528382172, 1),
                        'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00',
                                      'keyId': 0}},
       'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'},
       'ok': 1.0,
       'operationTime': Timestamp(1528382172, 1),
       'result': []}
      

      When connected to a standalone server or a replica set, an undocumented, nonstandard error is returned (the error code is the same in both cases):

      >>> c.list_database_names()
      ['admin', 'config', 'local']
      >>> pprint.pprint(c.admin.command('ismaster'))
      {'ismaster': True,
       'localTime': datetime.datetime(2018, 6, 7, 14, 44, 2, 765000),
       'logicalSessionTimeoutMinutes': 30,
       'maxBsonObjectSize': 16777216,
       'maxMessageSizeBytes': 48000000,
       'maxWireVersion': 7,
       'maxWriteBatchSize': 100000,
       'minWireVersion': 0,
       'ok': 1.0,
       'readOnly': False}
      >>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
      {'code': 40481,
       'codeName': 'Location40481',
       'errmsg': 'Unable to retrieve count in $collStats stage: Database [bar] not '
                 'found.',
       'ok': 0.0}
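
Until the behavior is unified, a client has to recognize both reply shapes shown above. The following is only a sketch of one way to do that on the raw reply dict, with no server involved. The helper name `namespace_missing` and the set `MISSING_NAMESPACE_CODES` are hypothetical, not part of PyMongo, and the empty-batch check assumes the command was run with the default batch size, as in the transcripts.

```python
# Codes treated as "namespace does not exist" (an assumption for this sketch):
# 40481 is the nonstandard code shown in this ticket, 26 is NamespaceNotFound.
MISSING_NAMESPACE_CODES = {26, 40481}


def namespace_missing(response):
    """Return True if a raw $collStats count reply indicates a missing namespace.

    `response` is the dict returned by Database.command(..., check=False).
    """
    if not response.get("ok"):
        # Standalone / replica set: an error document comes back.
        return response.get("code") in MISSING_NAMESPACE_CODES
    # mongos: the command succeeds but the first batch is empty.
    return not response.get("cursor", {}).get("firstBatch")


# Abbreviated copies of the two replies from the transcripts above.
mongos_reply = {"cursor": {"firstBatch": [], "id": 0, "ns": "bar.bar"}, "ok": 1.0}
mongod_reply = {"code": 40481, "codeName": "Location40481", "ok": 0.0}
print(namespace_missing(mongos_reply), namespace_missing(mongod_reply))
```

Any other error code is deliberately not treated as a missing namespace, so unrelated failures still surface to the caller.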
      

      The behavior should be consistent regardless of topology. I think there are two choices for how to resolve this:

      • Change the behavior for standalone and replica set to match mongos. That is, just don't return any results. This would match the behavior of using $group + $sum (or $count) to count the documents in a collection.
      • Change the error code to a standard, documented error, and have mongos return the same error. A good choice would be 26, NamespaceNotFound.

      My personal preference is for the first option, which matches the behavior of $count:

      >>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$match": {}}, {"$count": "count"}], cursor={}, check=False))
      {'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'}, 'ok': 1.0}
      >>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$match": {}}, {"$group": {"_id": None, "count": {"$sum": 1}}}], cursor={}, check=False))
      {'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'}, 'ok': 1.0}
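
For comparison, here is a rough sketch of how a caller could read a count out of the $count / $group replies above, treating an empty first batch as zero documents. The helper `count_from_reply` is a hypothetical name for illustration, and it assumes the counting stage writes its result to a field called "count", as in both pipelines.

```python
def count_from_reply(response, field="count"):
    """Extract the count from a successful $count or $group reply.

    An empty first batch (missing namespace or empty collection) yields 0.
    """
    batch = response["cursor"]["firstBatch"]
    return batch[0][field] if batch else 0


# Reply shape copied from the transcript: the namespace does not exist.
empty_reply = {"cursor": {"firstBatch": [], "id": 0, "ns": "bar.bar"}, "ok": 1.0}
print(count_from_reply(empty_reply))
```

With this reading, a missing namespace and an empty collection are indistinguishable, which is the trade-off of the first option.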
      

            Assignee:
            samuel.mercier@mongodb.com Sam Mercier
            Reporter:
            bernie@mongodb.com Bernie Hackett