Core Server / SERVER-1752

improve the performance of simple counts

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.3.2
    • Affects Version/s: 1.7.0
    • Component/s: Performance
    • Labels:
      None
    • Environment:
      Linux

      Summary:

      Two optimizations have been implemented for the count operation:

      1) Normally, when documents are retrieved from an indexed Cursor, the index key and/or document at each position of the cursor are checked by a Matcher to see if they match the query. As an optimization, this Matcher check is now bypassed in simple cases when the documents iterated by the Cursor must always match the query. Specifically, if an 'Optimal' btree index is used to perform a count, and the index bounds for the count's query spec on that index are determined to exactly describe the documents that match the query, then a Matcher is not used. An explain( true ) on a find with the same query spec as the count will generally indicate if an 'Optimal' index is used (generally the case if there is only one btree plan in allPlans) and will show the indexBounds on the index.

      2) Normally, when a btree cursor iterates over keys in a btree, the cursor checks every key traversed to see if it falls within the calculated indexBounds. As an optimization, this key check is now bypassed in simple cases where the iteration endpoint can be precomputed. Specifically, if an 'Optimal' btree index is used to perform a count, and the indexBounds describe a single interval within the btree, then the endpoint of that interval is located in advance so that the traversed keys do not need to be individually checked for inclusion in the interval. An explain( true ) of an optimal index will generally indicate usage of a single interval when the cursor explain field does not have a "multi" suffix.
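
The two optimizations above can be illustrated with a toy model. This is only a sketch, not the server's implementation: a sorted array of [appId, topicId] keys stands in for the btree, and every function name here (compareKeys, lowerBound, upperBound, countWithChecks, countSingleInterval) is hypothetical.

```javascript
// Toy stand-in for a btree index: a sorted array of [appId, topicId] keys.
// All names are hypothetical; this is not MongoDB's actual code.

// Compare two compound keys field by field.
function compareKeys(a, b) {
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Position of the first key >= target (binary search).
function lowerBound(keys, target) {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compareKeys(keys[mid], target) < 0) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Position of the first key > target (binary search).
function upperBound(keys, target) {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compareKeys(keys[mid], target) <= 0) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Unoptimized count: every traversed key is checked against the bounds,
// and every key inside the bounds is also passed to a matcher.
function countWithChecks(keys, low, high, matcher) {
  let n = 0;
  for (let i = lowerBound(keys, low); i < keys.length; i++) {
    if (compareKeys(keys[i], high) > 0) break; // per-key bounds check
    if (matcher(keys[i])) n++;                 // per-key matcher check
  }
  return n;
}

// Optimized count for a single interval whose bounds exactly describe the
// matching keys: the end position is located once, up front, so the loop
// needs neither a bounds comparison nor a matcher call per key.
function countSingleInterval(keys, low, high) {
  const start = lowerBound(keys, low);
  const end = upperBound(keys, high);
  let n = 0;
  for (let i = start; i < end; i++) n++; // traversal only, no key checks
  return n;
}

const exampleKeys = [[1, 1], [1, 1], [1, 103], [1, 103], [1, 103], [2, 1]];
const matchTopic103 = (k) => k[0] === 1 && k[1] === 103;
console.log(countWithChecks(exampleKeys, [1, 103], [1, 103], matchTopic103)); // 3
console.log(countSingleInterval(exampleKeys, [1, 103], [1, 103]));           // 3
```

On a plain sorted array the optimized version could simply return end - start; the loop is kept to mirror a cursor that still walks the keys but no longer inspects each one.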

      Aaron

      ---------------------------------

      Count performance is so poor that we cannot use it. If a client has to wait 5000 ms for every count request, they will be unhappy!
      db.Comment.count() completes quickly, but db.Comment.count({appId:1,topicId:1}) takes much longer.
      I am using mongo 1.7.0:
      {
        "cursor" : "BtreeCursor appId_1_topicId_1",
        "nscanned" : 2101980,
        "nscannedObjects" : 2101980,
        "n" : 2101980,
        "millis" : 4677,
        "indexBounds" : { "appId" : [ [ 1, 1 ] ], "topicId" : [ [ 103, 103 ] ] }
      }

      A simple count takes 4677 ms. If the result is smaller, e.g. 51140 instead of 2101980, the count query completes in less than 200 ms:
      {
        "cursor" : "BtreeCursor appId_1_topicId_1",
        "nscanned" : 51140,
        "nscannedObjects" : 51140,
        "n" : 51140,
        "millis" : 108,
        "indexBounds" : { "appId" : [ [ 1, 1 ] ], "topicId" : [ [ 1, 1 ] ] }
      }
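
Dividing nscanned by millis for the two explain outputs above gives a rough scan rate for each run. This is only arithmetic on the reported numbers, not a benchmark, and the names runs and keysPerMilli are made up for illustration.

```javascript
// Figures copied from the two explain() outputs in this report.
const runs = [
  { topicId: 103, nscanned: 2101980, millis: 4677 },
  { topicId: 1, nscanned: 51140, millis: 108 },
];

// Keys scanned per millisecond for one run.
function keysPerMilli(run) {
  return run.nscanned / run.millis;
}

const rates = runs.map((run) => Math.round(keysPerMilli(run)));
console.log(rates); // [ 449, 474 ]
```

Both runs scan at a similar rate, which suggests that the count time grows with the number of index keys scanned rather than with anything specific to topicId 103.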

      count() cannot be used with explain(), so I ran db.Comment.find({appId:1,topicId:1}).explain() instead; the result is probably the same.
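
When reading a find().explain() document in place of a count, a small helper can pull out the two signals that matter here. This is a sketch under assumptions: summarizeExplain is a hypothetical function, it relies on the "multi" suffix convention for the cursor field, and treating nscanned equal to n as "every scanned key matched" is a heuristic, not a server guarantee.

```javascript
// Hypothetical helper: summarizes an explain() document of the shape
// shown in this report. Not part of the mongo shell.
function summarizeExplain(explainDoc) {
  const cursor = String(explainDoc.cursor || "");
  const usesBtree = cursor.startsWith("BtreeCursor");
  return {
    usesBtree: usesBtree,
    // Assumption: a cursor name without a "multi" suffix means one interval.
    singleInterval: usesBtree && !cursor.endsWith("multi"),
    // Heuristic: no scanned key was rejected if nscanned equals n.
    everyScannedKeyMatched: explainDoc.nscanned === explainDoc.n,
  };
}

// The first explain output from this report.
const reportedExplain = {
  cursor: "BtreeCursor appId_1_topicId_1",
  nscanned: 2101980,
  nscannedObjects: 2101980,
  n: 2101980,
  millis: 4677,
  indexBounds: { appId: [[1, 1]], topicId: [[103, 103]] },
};

console.log(summarizeExplain(reportedExplain));
```

In the shell the helper could be applied directly to the result of db.Comment.find({appId:1,topicId:1}).explain(), assuming the shell's JavaScript engine supports startsWith and endsWith.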
