  1. Core Server
  2. SERVER-35522

Make $collStats with count never return result field

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Query 2020-11-30, Query 2020-12-14

      New Description:
      This ticket tracks the removal of the "result" field from the `$collStats` count results.

      Old Description:
      On a sharded cluster, running $collStats for a namespace that doesn't exist returns an empty 'result' array:

      >>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
      {'$clusterTime': {'clusterTime': Timestamp(1528553991, 1),
                        'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00',
                                      'keyId': 0}},
       'cursor': {'firstBatch': [], 'id': 0, 'ns': 'bar.bar'},
       'ok': 1.0,
       'operationTime': Timestamp(1528553991, 1),
       'result': []}
      

      When the namespace does exist, it returns a document for each shard, but no 'result' array:

      >>> pprint.pprint(c.foo.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
      {'$clusterTime': {'clusterTime': Timestamp(1528554032, 1),
                        'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00',
                                      'keyId': 0}},
       'cursor': {'firstBatch': [{'count': 174899,
                                  'host': 'devbox:29018',
                                  'localTime': datetime.datetime(2018, 6, 9, 14, 20, 35, 202000),
                                  'ns': 'foo.bar',
                                  'shard': 'shard0001'},
                                 {'count': 125101,
                                  'host': 'devbox:29017',
                                  'localTime': datetime.datetime(2018, 6, 9, 14, 20, 35, 202000),
                                  'ns': 'foo.bar',
                                  'shard': 'shard0000'}],
                  'id': 0,
                  'ns': 'foo.bar'},
       'ok': 1.0,
       'operationTime': Timestamp(1528554032, 1)}
      

      Standalones and replica sets never return a 'result' field:

      >>> pprint.pprint(c.foo.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
      {'cursor': {'firstBatch': [{'count': 1,
                                  'host': 'devbox:57017',
                                  'localTime': datetime.datetime(2018, 6, 9, 14, 21, 0, 164000),
                                  'ns': 'foo.bar'}],
                  'id': 0,
                  'ns': 'foo.bar'},
       'ok': 1.0}
      >>> pprint.pprint(c.bar.command('aggregate', 'bar', pipeline=[{"$collStats": {"count": {}}}], cursor={}, check=False))
      {'code': 40481,
       'codeName': 'Location40481',
       'errmsg': 'Unable to retrieve count in $collStats stage: Database [bar] not '
                 'found.',
       'ok': 0.0}
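
The replies above differ by topology: mongos may add an empty 'result' array, while mongod returns error 40481 for a missing database. As a rough client-side sketch (not part of the server change), the helper below reads only `cursor.firstBatch` and ignores any top-level 'result' key. It assumes the reply is the plain dict returned with `check=False`, as in the sessions above. The names `per_shard_counts` and `NAMESPACE_NOT_FOUND_CODE` are hypothetical, and treating error 40481 as "no counts" is an assumption, not documented driver behavior.

```python
# Error code taken from the mongod reply shown in this ticket.
NAMESPACE_NOT_FOUND_CODE = 40481


def per_shard_counts(response):
    """Return the per-host counts from a $collStats count reply.

    `response` is the raw command reply (a dict), as returned by
    command(..., check=False) in the sessions above.
    """
    if not response.get("ok"):
        # Assumption: a missing database is treated as "no counts".
        if response.get("code") == NAMESPACE_NOT_FOUND_CODE:
            return []
        raise RuntimeError(response.get("errmsg", "aggregate failed"))
    # Deliberately ignore any top-level 'result' key; only the cursor
    # batch carries the per-shard (or per-host) documents.
    return [doc["count"] for doc in response["cursor"]["firstBatch"]]
```

With this helper, the mongos reply for a missing namespace and the mongod error reply both yield an empty list, so callers see the same shape on every topology.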
      

      I have no idea why mongos returns a result field when the namespace doesn't exist, but a result field would be really useful if it returned the sum of the counts from all shards. That would avoid application or driver developers having to $group and $sum the results from each shard. If we were to do that, mongod should also return a result field for consistency between topology types.
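
For reference, the $group and $sum workaround mentioned above might look like the following sketch. `COUNT_PIPELINE` and `total_count` are hypothetical names. The pipeline would be passed to the same aggregate command used in the sessions above, and `total_count` does the equivalent sum on the client from `cursor.firstBatch`.

```python
# Server-side: append a $group stage that sums the per-shard counts.
# $collStats has to remain the first stage of the pipeline.
COUNT_PIPELINE = [
    {"$collStats": {"count": {}}},
    {"$group": {"_id": None, "count": {"$sum": "$count"}}},
]


def total_count(first_batch):
    """Client-side equivalent: sum 'count' over the per-shard documents."""
    return sum(doc["count"] for doc in first_batch)


# Per-shard documents copied from the sharded reply in this ticket
# (only the fields needed for the sum are kept).
first_batch = [
    {"count": 174899, "shard": "shard0001"},
    {"count": 125101, "shard": "shard0000"},
]
print(total_count(first_batch))  # 300000
```

An empty batch sums to 0, which matches what a caller would likely expect for a namespace that does not exist.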

            Assignee:
            samuel.mercier@mongodb.com Sam Mercier
            Reporter:
            bernie@mongodb.com Bernie Hackett
            Votes:
            0
            Watchers:
            12
