Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-12925

Queries with projections limiting returned fields can result in full collection scans

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 2.6.0-rc0
    • Fix Version/s: 2.6.0-rc1
    • Component/s: None
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      > db.version()
      2.6.0-rc0
       
      > db.foo.ensureIndex({ id1:1 })
      WriteResult({ "nInserted" : 1 })
      > db.foo.ensureIndex({ id2:1 })
      WriteResult({ "nInserted" : 1 })
      > for( var i=1; i<=500000; i++ ){db.foo.save({ id1:i, id2:i })}
      WriteResult({ "nInserted" : 1 })
      > db.foo.find({ id1:{ $lte:10 }, id2:1 }).explain()
      {
              "cursor" : "BtreeCursor id1_1",
              "isMultiKey" : false,
              "n" : 1,
              "nscannedObjects" : 10,
              "nscanned" : 10,
              "nscannedObjectsAllPlans" : 12,
              "nscannedAllPlans" : 15,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 0,
              "nChunkSkips" : 0,
              "millis" : 1,
              "indexBounds" : {
                      "id1" : [
                              [
                                      -Infinity,
                                      10
                              ]
                      ]
              },
              "server" : "Jeffs-MacBook-Air.local:27017",
              "filterSet" : false
      }
      > db.foo.find({ id1:{ $lte:10 }, id2:1 }, { _id:0, id1:1, id2:1 }).explain()
      {
              "cursor" : "BasicCursor",
              "isMultiKey" : false,
              "n" : 1,
              "nscannedObjects" : 500000,
              "nscanned" : 500000,
              "nscannedObjectsAllPlans" : 500003,
              "nscannedAllPlans" : 500007,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 3906,
              "nChunkSkips" : 0,
              "millis" : 303,
              "server" : "Jeffs-MacBook-Air.local:27017",
              "filterSet" : false
      }

      Show
      > db.version() 2.6.0-rc0   > db.foo.ensureIndex({ id1:1 }) WriteResult({ "nInserted" : 1 }) > db.foo.ensureIndex({ id2:1 }) WriteResult({ "nInserted" : 1 }) > for( var i=1; i<=500000; i++ ){db.foo.save({ id1:i, id2:i })} WriteResult({ "nInserted" : 1 }) > db.foo.find({ id1:{ $lte:10 }, id2:1 }).explain() { "cursor" : "BtreeCursor id1_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 10, "nscanned" : 10, "nscannedObjectsAllPlans" : 12, "nscannedAllPlans" : 15, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 1, "indexBounds" : { "id1" : [ [ -Infinity, 10 ] ] }, "server" : "Jeffs-MacBook-Air.local:27017", "filterSet" : false } > db.foo.find({ id1:{ $lte:10 }, id2:1 }, { _id:0, id1:1, id2:1 }).explain() { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 500000, "nscanned" : 500000, "nscannedObjectsAllPlans" : 500003, "nscannedAllPlans" : 500007, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 3906, "nChunkSkips" : 0, "millis" : 303, "server" : "Jeffs-MacBook-Air.local:27017", "filterSet" : false }

      Description

      Hi there,

      It seems that if you specify fields to return in a projection, the query can end up doing a full collection scan even if there are indexes that should support the query predicates.

      This doesn't appear to be an explain bug from what I can tell.

        Activity

        Hide
        rassi@10gen.com Jason Rassi added a comment -

        Could you paste the output of re-running those two queries with explain(true)? That gives verbose explain output.

        Show
        rassi@10gen.com Jason Rassi added a comment - Could you paste the output of re-running those two queries with explain(true)? That gives verbose explain output.
        Hide
        jlee Jeff lee (Inactive) added a comment -

        Sure thing...here ya go. Note that I reduced the size of the collection and simplified the projection.

        db.foo.ensureIndex({ id1:1 })
        db.foo.ensureIndex({ id2:1 })
        for( var i=1; i<= 10000; i++ ){db.foo.save({ id1:i, id2:i })}
         
        > db.foo.find({ id1:{ $lte:10 }, id2:1 }).explain(true)
        {
        	"cursor" : "BtreeCursor id1_1",
        	"isMultiKey" : false,
        	"n" : 1,
        	"nscannedObjects" : 10,
        	"nscanned" : 10,
        	"nscannedObjectsAllPlans" : 12,
        	"nscannedAllPlans" : 15,
        	"scanAndOrder" : false,
        	"indexOnly" : false,
        	"nYields" : 0,
        	"nChunkSkips" : 0,
        	"millis" : 0,
        	"indexBounds" : {
        		"id1" : [
        			[
        				-Infinity,
        				10
        			]
        		]
        	},
        	"allPlans" : [
        		{
        			"cursor" : "BtreeCursor id1_1",
        			"isMultiKey" : false,
        			"n" : 1,
        			"nscannedObjects" : 10,
        			"nscanned" : 10,
        			"scanAndOrder" : false,
        			"indexOnly" : false,
        			"nChunkSkips" : 0,
        			"indexBounds" : {
        				"id1" : [
        					[
        						-Infinity,
        						10
        					]
        				]
        			}
        		},
        		{
        			"cursor" : "BtreeCursor id2_1",
        			"isMultiKey" : false,
        			"n" : 1,
        			"nscannedObjects" : 1,
        			"nscanned" : 1,
        			"scanAndOrder" : false,
        			"indexOnly" : false,
        			"nChunkSkips" : 0,
        			"indexBounds" : {
        				"id2" : [
        					[
        						1,
        						1
        					]
        				]
        			}
        		},
        		{
        			"cursor" : "BasicCursor",
        			"isMultiKey" : false,
        			"n" : 1,
        			"nscannedObjects" : 1,
        			"nscanned" : 1,
        			"scanAndOrder" : false,
        			"indexOnly" : false,
        			"nChunkSkips" : 0
        		},
        		{
        			"cursor" : "Complex Plan",
        			"n" : 0,
        			"nscannedObjects" : 0,
        			"nscanned" : 3,
        			"nChunkSkips" : 0
        		}
        	],
        	"server" : "Jeffs-MacBook-Air.local:27017",
        	"filterSet" : false,
        	"stats" : {
        		"type" : "KEEP_MUTATIONS",
        		"works" : 11,
        		"yields" : 0,
        		"unyields" : 0,
        		"invalidates" : 0,
        		"advanced" : 1,
        		"needTime" : 9,
        		"needFetch" : 0,
        		"isEOF" : 1,
        		"children" : [
        			{
        				"type" : "FETCH",
        				"works" : 11,
        				"yields" : 0,
        				"unyields" : 0,
        				"invalidates" : 0,
        				"advanced" : 1,
        				"needTime" : 9,
        				"needFetch" : 0,
        				"isEOF" : 1,
        				"alreadyHasObj" : 0,
        				"forcedFetches" : 0,
        				"matchTested" : 1,
        				"children" : [
        					{
        						"type" : "IXSCAN",
        						"works" : 10,
        						"yields" : 0,
        						"unyields" : 0,
        						"invalidates" : 0,
        						"advanced" : 10,
        						"needTime" : 0,
        						"needFetch" : 0,
        						"isEOF" : 1,
        						"keyPattern" : "{ id1: 1.0 }",
        						"boundsVerbose" : "field #0['id1']: [-inf.0, 10.0]",
        						"isMultiKey" : 0,
        						"yieldMovedCursor" : 0,
        						"dupsTested" : 0,
        						"dupsDropped" : 0,
        						"seenInvalidated" : 0,
        						"matchTested" : 0,
        						"keysExamined" : 10,
        						"children" : [ ]
        					}
        				]
        			}
        		]
        	}
        }
         
        > db.foo.find({ id1:{ $lte:10 }, id2:1 }, { id1:1, id2:1}).explain(true)
        {
        	"cursor" : "BasicCursor",
        	"isMultiKey" : false,
        	"n" : 1,
        	"nscannedObjects" : 10000,
        	"nscanned" : 10000,
        	"nscannedObjectsAllPlans" : 10003,
        	"nscannedAllPlans" : 10007,
        	"scanAndOrder" : false,
        	"indexOnly" : false,
        	"nYields" : 78,
        	"nChunkSkips" : 0,
        	"millis" : 11,
        	"allPlans" : [
        		{
        			"cursor" : "BasicCursor",
        			"isMultiKey" : false,
        			"n" : 1,
        			"nscannedObjects" : 10000,
        			"nscanned" : 10000,
        			"scanAndOrder" : false,
        			"indexOnly" : false,
        			"nChunkSkips" : 0
        		},
        		{
        			"cursor" : "BtreeCursor id1_1",
        			"isMultiKey" : false,
        			"n" : 1,
        			"nscannedObjects" : 2,
        			"nscanned" : 3,
        			"scanAndOrder" : false,
        			"indexOnly" : false,
        			"nChunkSkips" : 0,
        			"indexBounds" : {
        				"id1" : [
        					[
        						-Infinity,
        						10
        					]
        				]
        			}
        		},
        		{
        			"cursor" : "BtreeCursor id2_1",
        			"isMultiKey" : false,
        			"n" : 1,
        			"nscannedObjects" : 1,
        			"nscanned" : 1,
        			"scanAndOrder" : false,
        			"indexOnly" : false,
        			"nChunkSkips" : 0,
        			"indexBounds" : {
        				"id2" : [
        					[
        						1,
        						1
        					]
        				]
        			}
        		},
        		{
        			"cursor" : "Complex Plan",
        			"n" : 0,
        			"nscannedObjects" : 0,
        			"nscanned" : 3,
        			"nChunkSkips" : 0
        		}
        	],
        	"server" : "Jeffs-MacBook-Air.local:27017",
        	"filterSet" : false,
        	"stats" : {
        		"type" : "PROJECTION",
        		"works" : 10002,
        		"yields" : 78,
        		"unyields" : 78,
        		"invalidates" : 0,
        		"advanced" : 1,
        		"needTime" : 0,
        		"needFetch" : 0,
        		"isEOF" : 1,
        		"children" : [
        			{
        				"type" : "COLLSCAN",
        				"works" : 10002,
        				"yields" : 78,
        				"unyields" : 78,
        				"invalidates" : 0,
        				"advanced" : 1,
        				"needTime" : 10000,
        				"needFetch" : 0,
        				"isEOF" : 1,
        				"docsTested" : 10000,
        				"children" : [ ]
        			}
        		]
        	}
        }

        Show
        jlee Jeff lee (Inactive) added a comment - Sure thing...here ya go. Note that I reduced the size of the collection and simplified the projection. db.foo.ensureIndex({ id1:1 }) db.foo.ensureIndex({ id2:1 }) for( var i=1; i<= 10000; i++ ){db.foo.save({ id1:i, id2:i })}   > db.foo.find({ id1:{ $lte:10 }, id2:1 }).explain(true) { "cursor" : "BtreeCursor id1_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 10, "nscanned" : 10, "nscannedObjectsAllPlans" : 12, "nscannedAllPlans" : 15, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 0, "indexBounds" : { "id1" : [ [ -Infinity, 10 ] ] }, "allPlans" : [ { "cursor" : "BtreeCursor id1_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 10, "nscanned" : 10, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "id1" : [ [ -Infinity, 10 ] ] } }, { "cursor" : "BtreeCursor id2_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 1, "nscanned" : 1, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "id2" : [ [ 1, 1 ] ] } }, { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 1, "nscanned" : 1, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0 }, { "cursor" : "Complex Plan", "n" : 0, "nscannedObjects" : 0, "nscanned" : 3, "nChunkSkips" : 0 } ], "server" : "Jeffs-MacBook-Air.local:27017", "filterSet" : false, "stats" : { "type" : "KEEP_MUTATIONS", "works" : 11, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 1, "needTime" : 9, "needFetch" : 0, "isEOF" : 1, "children" : [ { "type" : "FETCH", "works" : 11, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 1, "needTime" : 9, "needFetch" : 0, "isEOF" : 1, "alreadyHasObj" : 0, "forcedFetches" : 0, "matchTested" : 1, "children" : [ { "type" : "IXSCAN", "works" : 10, "yields" : 0, "unyields" : 0, "invalidates" : 0, "advanced" : 10, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "keyPattern" : "{ id1: 1.0 }", "boundsVerbose" : "field #0['id1']: [-inf.0, 10.0]", "isMultiKey" : 0, "yieldMovedCursor" : 0, "dupsTested" : 0, "dupsDropped" : 0, "seenInvalidated" : 0, "matchTested" : 0, "keysExamined" : 10, "children" : [ ] } ] } ] } }   > db.foo.find({ id1:{ $lte:10 }, id2:1 }, { id1:1, id2:1}).explain(true) { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 10000, "nscanned" : 10000, "nscannedObjectsAllPlans" : 10003, "nscannedAllPlans" : 10007, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 78, "nChunkSkips" : 0, "millis" : 11, "allPlans" : [ { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 10000, "nscanned" : 10000, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0 }, { "cursor" : "BtreeCursor id1_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 2, "nscanned" : 3, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "id1" : [ [ -Infinity, 10 ] ] } }, { "cursor" : "BtreeCursor id2_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 1, "nscanned" : 1, "scanAndOrder" : false, "indexOnly" : false, "nChunkSkips" : 0, "indexBounds" : { "id2" : [ [ 1, 1 ] ] } }, { "cursor" : "Complex Plan", "n" : 0, "nscannedObjects" : 0, "nscanned" : 3, "nChunkSkips" : 0 } ], "server" : "Jeffs-MacBook-Air.local:27017", "filterSet" : false, "stats" : { "type" : "PROJECTION", "works" : 10002, "yields" : 78, "unyields" : 78, "invalidates" : 0, "advanced" : 1, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "children" : [ { "type" : "COLLSCAN", "works" : 10002, "yields" : 78, "unyields" : 78, "invalidates" : 0, "advanced" : 1, "needTime" : 10000, "needFetch" : 0, "isEOF" : 1, "docsTested" : 10000, "children" : [ ] } ] } }
        Hide
        hari.khalsa@10gen.com Hari Khalsa (Inactive) added a comment -

        Thank you for reporting this! I actually submitted a fix the other day for this issue and it will be in RC1:

        https://github.com/mongodb/mongo/commit/a3d910925350d2f0204b41ea145e24f74e5c39ce

        In brief, we used to consider a collection scan as a possible way to execute the query even when an indexed plan was available. If a collection scan found as many matching documents as an indexed plan, we'd prefer the collection scan because it had both of the values we were projecting (and the index only had one of them).

        Show
        hari.khalsa@10gen.com Hari Khalsa (Inactive) added a comment - Thank you for reporting this! I actually submitted a fix the other day for this issue and it will be in RC1: https://github.com/mongodb/mongo/commit/a3d910925350d2f0204b41ea145e24f74e5c39ce In brief, we used to consider a collection scan as a possible way to execute the query even when an indexed plan was available. If a collection scan found as many matching documents as an indexed plan, we'd prefer the collection scan because it had both of the values we were projecting (and the index only had one of them).
        Hide
        jlee Jeff lee (Inactive) added a comment -

        Thanks Hari! I look forward to testing it out. There are some great new features in 2.6.

        This is probably a topic for another JIRA, but I'm curious how this change will affect situations where a collection scan would be preferred over the index scan.

        I'm imagining a scenario where the existing index is on a key with low cardinality, especially if there has been a lot of document movement and the query needs to hit disk. In this situation I would imagine that a full scan would be a better choice. I know that in previous versions, the first plan to return n documents (1001?) was ranked higher than others. Is this still the case in 2.6?

        Show
        jlee Jeff lee (Inactive) added a comment - Thanks Hari! I look forward to testing it out. There are some great new features in 2.6. This is probably a topic for another JIRA, but I'm curious how this change will affect situations where a collection scan would be preferred over the index scan. I'm imagining a scenario where the existing index is on a key with low cardinality, especially if there has been a lot of document movement and the query needs to hit disk. In this situation I would imagine that a full scan would be a better choice. I know that in previous versions, the first plan to return n documents (1001?) was ranked higher than others. Is this still the case in 2.6?

          People

          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:
              Days since reply:
              1 year, 17 weeks, 6 days ago
              Date of 1st Reply: