Core Server / SERVER-1752

improve the performance of simple counts

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 2.3.2
    • Component/s: Performance
    • Labels: None
    • Environment: Linux
    • Backport: No
    • # Replies: 50
    • Last comment by Customer: false
    • Documentation changes needed?: Yes

      Description

      Summary:

      Two optimizations have been implemented for the count operation:

      1) Normally, when documents are retrieved from an indexed Cursor, the index key and/or document at each position of the cursor are checked by a Matcher to see if they match the query. As an optimization, this Matcher check is now bypassed in simple cases when the documents iterated by the Cursor must always match the query. Specifically, if an 'Optimal' btree index is used to perform a count, and the index bounds for the count's query spec on that index are determined to exactly describe the documents that match the query, then a Matcher is not used. An explain( true ) on a find with the same query spec as the count will generally indicate if an 'Optimal' index is used (generally the case if there is only one btree plan in allPlans) and will show the indexBounds on the index.
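      As a rough illustration of the first optimization, the sketch below models an index as a plain array of entries and compares a count that runs a matcher on every entry with one that trusts the index bounds alone. It is not the server's implementation; the names keyInBounds, matches, countWithMatcher, countWithoutMatcher and the sample data are all hypothetical.

```javascript
// Hypothetical in-memory stand-in for an index on { appId: 1, topicId: 1 }.
// Each entry pairs an index key with the document it points to.
const sampleDocs = [
  { appId: 1, topicId: 1 },
  { appId: 1, topicId: 103 },
  { appId: 2, topicId: 1 },
  { appId: 1, topicId: 1 },
];
const sampleIndex = sampleDocs.map((d) => ({ key: [d.appId, d.topicId], doc: d }));

// Query spec and the index bounds derived from it (one [low, high] pair per field).
const sampleQuery = { appId: 1, topicId: 1 };
const sampleBounds = [[1, 1], [1, 1]];

// True when every field of the key lies inside its [low, high] interval.
function keyInBounds(key, bounds) {
  return key.every((v, i) => bounds[i][0] <= v && v <= bounds[i][1]);
}

// Simplified matcher: equality on every field named in the query.
function matches(doc, query) {
  return Object.keys(query).every((field) => doc[field] === query[field]);
}

// Count with a matcher check on every entry inside the bounds.
function countWithMatcher(index, bounds, query) {
  let n = 0;
  for (const entry of index) {
    if (keyInBounds(entry.key, bounds) && matches(entry.doc, query)) n++;
  }
  return n;
}

// Count without the matcher: valid only when the bounds exactly describe
// the documents that match the query, as they do for this equality query.
function countWithoutMatcher(index, bounds) {
  let n = 0;
  for (const entry of index) {
    if (keyInBounds(entry.key, bounds)) n++;
  }
  return n;
}

console.log(countWithMatcher(sampleIndex, sampleBounds, sampleQuery));
console.log(countWithoutMatcher(sampleIndex, sampleBounds));
```

      Both functions return the same count here because the equality bounds and the query select exactly the same entries, which is the condition under which the matcher can be skipped.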

      2) Normally, when a btree cursor iterates over keys in a btree, the cursor checks every key traversed to see if it falls within the calculated indexBounds. As an optimization, this key check is now bypassed in simple cases where the iteration endpoint can be precomputed. Specifically, if an 'Optimal' btree index is used to perform a count, and the indexBounds describe a single interval within the btree, then the endpoint of that interval is located in advance so that the traversed keys do not need to be individually checked for inclusion in the interval. An explain( true ) of an optimal index will generally indicate usage of a single interval when the cursor explain field does not have a "multi" suffix.
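      The second optimization can be sketched in the same spirit, using a sorted array of compound keys in place of a btree. This is a simplified model under stated assumptions, not the server's code: compareKeys, firstIndexWhere, countCheckingEachKey and countWithPrecomputedEnd are hypothetical names, and binary search stands in for locating a position in the btree.

```javascript
// Lexicographic comparison of two compound keys of equal length.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Binary search for the first position where pred becomes true.
// Assumes pred is false for a prefix of keys and true for the rest.
function firstIndexWhere(keys, pred) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (pred(keys[mid])) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

// Baseline: compare every traversed key against the upper bound.
function countCheckingEachKey(keys, low, high) {
  let n = 0;
  let i = firstIndexWhere(keys, (k) => compareKeys(k, low) >= 0);
  while (i < keys.length && compareKeys(keys[i], high) <= 0) {
    n++;
    i++;
  }
  return n;
}

// Optimized: locate the end of the interval once, then walk from start
// to end without comparing any keys along the way.
function countWithPrecomputedEnd(keys, low, high) {
  const start = firstIndexWhere(keys, (k) => compareKeys(k, low) >= 0);
  const end = firstIndexWhere(keys, (k) => compareKeys(k, high) > 0);
  let n = 0;
  for (let i = start; i < end; i++) n++;
  return n;
}

// Sorted [appId, topicId] keys, loosely modeled on the report's index.
const sampleKeys = [[1, 1], [1, 1], [1, 2], [1, 103], [1, 103], [1, 103], [2, 1]];

console.log(countCheckingEachKey(sampleKeys, [1, 103], [1, 103]));
console.log(countWithPrecomputedEnd(sampleKeys, [1, 103], [1, 103]));
```

      In an array the count could be computed as end - start directly; the loop is kept to mirror a cursor that still visits each key but no longer compares it against the bounds.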

      Aaron

      ---------------------------------

      Count performance is so poor that we cannot use it. If a client must wait 5000 ms for every count request, they will be unhappy!
      db.Comment.count() returns quickly, but db.Comment.count({appId:1,topicId:1}) takes much longer.
      I am using mongo 1.7.0:
      {
        "cursor" : "BtreeCursor appId_1_topicId_1",
        "nscanned" : 2101980,
        "nscannedObjects" : 2101980,
        "n" : 2101980,
        "millis" : 4677,
        "indexBounds" : { "appId" : [ [ 1, 1 ] ], "topicId" : [ [ 103, 103 ] ] }
      }

      A simple count takes 4677 ms. If the result is smaller, e.g. 51140 instead of 2101980, the count query completes in less than 200 ms:
      {
        "cursor" : "BtreeCursor appId_1_topicId_1",
        "nscanned" : 51140,
        "nscannedObjects" : 51140,
        "n" : 51140,
        "millis" : 108,
        "indexBounds" : { "appId" : [ [ 1, 1 ] ], "topicId" : [ [ 1, 1 ] ] }
      }

      count() cannot be used with explain(), so I ran db.Comment.find({appId:1,topicId:1}).explain() instead; the results are probably the same.
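      As a quick check on the two explain outputs above, the arithmetic below computes how many index entries were scanned per millisecond in each run. The figures are taken from the report; the runs array and docsPerMilli helper are illustrative names only.

```javascript
// n and millis copied from the two explain() outputs in the report.
const runs = [
  { n: 2101980, millis: 4677 },
  { n: 51140, millis: 108 },
];

// Entries scanned per millisecond for one run.
function docsPerMilli(run) {
  return run.n / run.millis;
}

for (const run of runs) {
  console.log(run.n, Math.round(docsPerMilli(run)));
}
// prints "2101980 449" then "51140 474"
```

      The two rates are close, which suggests the count time grows roughly linearly with the number of index entries scanned rather than being dominated by a fixed cost.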

          Activity

          auto
          added a comment -

          Author: Aaron <aaron@10gen.com>
          Date: 2012-11-20T10:36:10Z

          Message: SERVER-1752 Fix debug build tests by removing old imprecise dassert in BtreeBucket56Bit.
          Branch: master
          https://github.com/mongodb/mongo/commit/08cfe5f5054eff0830c12289c38e7c49460ce86d

          Jon Hyman
          added a comment - edited

          Is there any way this could be backported into a 2.2.x build? Without a clear understanding of the 2.4 release timeframe (since it's not on the roadmap AFAICT), we would really benefit from this right now. While we've precomputed a great deal of queries we otherwise would have used count() for, some of our count queries are hard to precompute and take a long time.

          Eliot Horowitz
          added a comment -

          Jon - no, unfortunately a change like this cannot be backported, as it's relatively large and stability on 2.2 is paramount.
          2.4 should be in Q1 of 2013, the dev cycle for it is wrapping up in the next couple of weeks.

          Jon Hyman
          added a comment -

          Okay, thanks for the update. In general, you should update the "We aim to do a stable release every 3 months" tag on jira. 2.2 came out almost a year after 2.0, so I was worried that 2.4 was going to be far into the future.

          Eliot Horowitz
          added a comment -

          Jon - thanks, forgot about that message, updated.


            People

            • Votes: 134
            • Watchers: 118

              Dates

              • Created:
                Updated:
                Resolved:
                Days since reply:
                1 year, 17 weeks, 1 day ago
                Date of 1st Reply: