Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-50857

Improve count() performance in sharded clusters

    • Query 2020-10-05, Query 2020-10-19, Query 2020-11-02

      There is a huge performance regression when doing a simple count on an array of string in 4.4 (also in 4.4.1).

      I don't know if this is a 4.4 issue or if it appeared earlier because we are in the process of migration from 3.4 to 4.4. Still, here the complete description of the issue.

      We have a "test" collection of 2M documents, which almost all have a field "labels" like that (among a lot of other fields, these are quite large documents) :

      {"labels": ["aaaaa", "bbbbb", "ccccc"]}
      {"labels": ["ddddd"]}
      {"labels": []}
      {"labels": ["aaaaa", "ccccc"]}
      

      * There are about 120+ different values in the "labels" field

      • Some values are present only once, some hundreds of thousands of times.
      • The field "labels" has a simple index (not sparse nor partial)
      • In the sharded environment, the collection is correctly balanced according to sh.balancerCollectionStatus

      Running a "distinct" is very fast and seems to be using the index, whichever the test environment is (3.4, 4.4, sharded or not).

      db.test.distinct("labels");
      [
      	"aaaaa",
      	"bbbbb",
      	"ccccc",
      	...
      ]
      

      But when it comes to counting....

      Mongod 3.4, unsharded collection:

      db.test.count({"labels": "aaaaa"}); --- 1 occurrence, 0,1s
      db.test.count({"labels": "ccccc"}); --- 430000 occurrences, 0,3s
      db.test.count({"labels": "NON EXISTANT VALUE"}); --- 0 occurrence, 0,1s
      

      Mongod 3.4, sharded collection:

      db.test.count({"labels": "aaaaa"}); --- 1 occurrence, 0,1s
      db.test.count({"labels": "ccccc"}); --- 430000 occurrences, 0,2s
      db.test.count({"labels": "NON EXISTANT VALUE"}); --- 0 occurrence, 0,1s
      

      Mongod 4.4.1, sharded collection:

      db.test.count({"labels": "aaaaa"}); --- 1 occurence, 0,1s
      db.test.count({"labels": "ccccc"}); --- 430000 occurrences, 10+s << PROBLEM IS HERE
      db.test.count({"labels": "NON EXISTANT VALUE"}); --- 0 occurence, 0,1s
      

      Mongod 4.4.1, unsharded collection:

      db.test.count({"labels": "aaaaa"}); --- 1 occurrence, 0,1s
      db.test.count({"labels": "ccccc"}); --- 430000 occurrences, 0,3s
      db.test.count({"labels": "NON EXISTANT VALUE"}); --- 0 occurrence, 0,1s
      

      I am attaching the complete explains for you to see the behavior in the different environments, but as you can see, in the 4.4 sharded collection, each shard does an IXSCAN (60ms), then a FETCH (10s) then a SHARDING FILTER, etc. and what takes time is the fetching and later stages. It does take advantages of the index at all, whereas the single shard / not sharded version does.

      This has a HUGE impact on performance, and doing that with an aggregate is slow as well (because of the $unwind then $group pattern). In our application, because the aggregate is slow to do that operation (36s), it's WAY faster in 3.4 to use distinct then a count of each value (takes less than a second even in the sharded env with the 2M documents and 120 differents values).

      This also happens with non-multikey fields / index (as seen in the attached files).

        1. explain-4.4.1-sharded-multikey.json
          21 kB
        2. explain-4.4.1-sharded-1.json
          21 kB
        3. explain-4.2.7-sharded.json
          2 kB
        4. explain-3.4-unsharded-1.json
          5 kB
        5. explain-3.4-sharded-1.json
          18 kB

            Assignee:
            ian.boros@mongodb.com Ian Boros
            Reporter:
            hmducoulombier@marketing1by1.com Henri-Maxime Ducoulombier
            Votes:
            2 Vote for this issue
            Watchers:
            12 Start watching this issue

              Created:
              Updated:
              Resolved: