Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-14944

Use of Indexes for Query

    • Type: Icon: Question Question
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.6.4
    • Component/s: Querying
    • None

      I have a large collection of documents that look like:

      // Example Document 1
      {
        _id: 1,
        authorized_advertisers: [1,2,3],
        timestamp: 1406865600
      }
      
      // Example Document 2
      {
        _id: 2,
        authorized_advertisers: [1],
        timestamp: 1406865600
      }
      // Example Document 3
      {
        _id: 3,
        authorized_advertisers: [],
        timestamp: 1406865600
      }
      

      I have indexes on the following two indexes defined on this collection (among others):

      {
      		"v" : 1,
      		"key" : {
      			"authorized_advertisers" : 1
      		},
      		"name" : "authorized_advertisers_1",
      		"ns" : "cmply_app_engage_production.social_engagements",
      		"sparse" : true
      	}
      
      {
      		"v" : 1,
      		"key" : {
      			"authorized_advertisers" : 1,
      			"timestamp" : 1
      		},
      		"name" : "authorized_advertisers_1_timestamp_1",
      		"ns" : "cmply_app_engage_production.social_engagements"
      	}
      

      When I execute queries against this collection, a larger than (seemingly) necessary number of documents are scanned to produce the results. If I provide a hint to the query engine to use the authorized_advertisers_1_timestamp_1 index, the lower bound for the timestamp seems to be ignored.

      Criteria
      {
      	"authorized_advertisers" : ObjectId("4e6906f2dd06fd1f50000007"),
      	"timestamp" : {
      		"$lte" : 1408371025,
      		"$gt" : 1406865600
      	}
      }
      
      Query using $hint
      db.social_engagements.find(q).hint("authorized_advertisers_1_timestamp_1").explain()
      {
      	"cursor" : "BtreeCursor authorized_advertisers_1_timestamp_1",
      	"isMultiKey" : true,
      	"n" : 148823,
      	"nscannedObjects" : 1274509,
      	"nscanned" : 1274509,
      	"nscannedObjectsAllPlans" : 1274509,
      	"nscannedAllPlans" : 1274509,
      	"scanAndOrder" : false,
      	"indexOnly" : false,
      	"nYields" : 9959,
      	"nChunkSkips" : 0,
      	"millis" : 6447,
      	"indexBounds" : {
      		"authorized_advertisers" : [
      			[
      				ObjectId("4e6906f2dd06fd1f50000007"),
      				ObjectId("4e6906f2dd06fd1f50000007")
      			]
      		],
      		"timestamp" : [
      			[
      				-Infinity,
      				1408371025
      			]
      		]
      	},
      	"server" : "mydb.server.com:27017",
      	"filterSet" : false
      }
      
      Query withOUT $hint
      db.social_engagements.find(q).explain()
      {
      	"cursor" : "BtreeCursor authorized_advertisers_1",
      	"isMultiKey" : true,
      	"n" : 148823,
      	"nscannedObjects" : 1275405,
      	"nscanned" : 1275405,
      	"nscannedObjectsAllPlans" : 1275975,
      	"nscannedAllPlans" : 1275976,
      	"scanAndOrder" : false,
      	"indexOnly" : false,
      	"nYields" : 9969,
      	"nChunkSkips" : 0,
      	"millis" : 7162,
      	"indexBounds" : {
      		"authorized_advertisers" : [
      			[
      				ObjectId("4e6906f2dd06fd1f50000007"),
      				ObjectId("4e6906f2dd06fd1f50000007")
      			]
      		]
      	},
      	"server" : "mydb.server.com:27017",
      	"filterSet" : false
      }
      

      In each case, the same number of documents is returned. I don't understand why the lower bound for the timestamp is ignored, this would seem like it would reduce the number of documents the query would need to scan (in the collection or in the index) to produce a result.

      I've also tried having the index have the timestamp attribute as the first key in the index, but performance was significantly worse than my current configuration.

      These same criteria are used in Aggregation Framework queries, too. The queries take on average 7-10s to complete.

            Assignee:
            Unassigned Unassigned
            Reporter:
            michael@cmp.ly Michael De Lorenzo
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: