Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-16384

2d compound index results in a wider object scan for certain field name orderings

    XMLWordPrintable

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Won't Fix
    • 2.6.5
    • None
    • Geo, Querying
    • None
    • Fully Compatible
    • ALL
    • Hide
      1. Generate data - 20K large objects and related 2d compound index

        function generateData(count, uniqueAppIds) {
        	appIds = []
        	for (i = 0; i < uniqueAppIds; ++i) {
        		appIds[i] = ObjectId();
        	}
         
        	dummyx = new Array(10000);
        	dummyx = dummyx.join("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
         
         
        	for (i = 0; i < count; i++) {
        		doc = {
        			extra_s: dummyx,
        			xid: appIds[Math.round(Math.random() * 9)],
        			pos: [Math.round(Math.random() * 360) - 180, Math.round(Math.random() * 180) - 90]
        		};
        		db.places.insert(doc);
        	}
         
        	/* Ensure Indices */
        	db.places.ensureIndex({pos: "2d", xid: 1});
        }
         
        function generateData(20000, 100)

      2. drop vm cache

        # on MacOS this would be 
        sudo purge

      3. Run the query that is expected to work fine

        db.places.find({pos: { $nearSphere: [ 0, 0 ], $maxDistance: 125}, xid: ObjectId("547dbbe1290b9685990ca000"), xix: 1}).explain()

        {
        	"cursor" : "GeoSearchCursor",
        	"isMultiKey" : false,
        	"n" : 0,
        	"nscannedObjects" : 0,
        	"nscanned" : 20000,
        	"nscannedObjectsAllPlans" : 0,
        	"nscannedAllPlans" : 20000,
        	"scanAndOrder" : false,
        	"indexOnly" : false,
        	"nYields" : 0,
        	"nChunkSkips" : 0,
        	"millis" : 88,
        	"indexBounds" : {
        		
        	},
        	"server" : "aks-osx.local:37018",
        	"filterSet" : false
        }

      4. Drop VM cache

        sudo purge

      5. Run query that does wider object scan than intended

        db.places.find({pos: { $nearSphere: [ 0, 0 ], $maxDistance: 125}, xid: ObjectId("547dbbe1290b9685990ca000"), cix: 1}).explain()

        {
        	"cursor" : "GeoSearchCursor",
        	"isMultiKey" : false,
        	"n" : 0,
        	"nscannedObjects" : 20000,
        	"nscanned" : 20000,
        	"nscannedObjectsAllPlans" : 20000,
        	"nscannedAllPlans" : 20000,
        	"scanAndOrder" : false,
        	"indexOnly" : false,
        	"nYields" : 0,
        	"nChunkSkips" : 0,
        	"millis" : 13024,
        	"indexBounds" : {
        		
        	},
        	"server" : "aks-osx.local:37018",
        	"filterSet" : false
        }

      Show
      Generate data - 20K large objects and related 2d compound index function generateData(count, uniqueAppIds) { appIds = [] for (i = 0; i < uniqueAppIds; ++i) { appIds[i] = ObjectId(); }   dummyx = new Array(10000); dummyx = dummyx.join( "ABCDEFGHIJKLMNOPQRSTUVWXYZ" );     for (i = 0; i < count; i++) { doc = { extra_s: dummyx, xid: appIds[Math.round(Math.random() * 9)], pos: [Math.round(Math.random() * 360) - 180, Math.round(Math.random() * 180) - 90] }; db.places.insert(doc); }   /* Ensure Indices */ db.places.ensureIndex({pos: "2d" , xid: 1}); }   function generateData(20000, 100) drop vm cache # on MacOS this would be sudo purge Run the query that is expected to work fine db.places.find({pos: { $nearSphere: [ 0, 0 ], $maxDistance: 125}, xid: ObjectId("547dbbe1290b9685990ca000"), xix: 1}).explain() { "cursor" : "GeoSearchCursor", "isMultiKey" : false, "n" : 0, "nscannedObjects" : 0, "nscanned" : 20000, "nscannedObjectsAllPlans" : 0, "nscannedAllPlans" : 20000, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 88, "indexBounds" : { }, "server" : "aks-osx.local:37018", "filterSet" : false } Drop VM cache sudo purge Run query that does wider object scan than intended db.places.find({pos: { $nearSphere: [ 0, 0 ], $maxDistance: 125}, xid: ObjectId("547dbbe1290b9685990ca000"), cix: 1}).explain() { "cursor" : "GeoSearchCursor", "isMultiKey" : false, "n" : 0, "nscannedObjects" : 20000, "nscanned" : 20000, "nscannedObjectsAllPlans" : 20000, "nscannedAllPlans" : 20000, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 13024, "indexBounds" : { }, "server" : "aks-osx.local:37018", "filterSet" : false }

    Description

      When using a compound 2d index, the query framework results in wider collection scan than intended for certain field name ordering in the query filters.

      As seen in the repro output if there is an additional field in the query filter with field name that is lexicographically lower than the non-geo field that is part of the compound 2d index, it will result in a wider object scan (all objects will be loaded after applying the Geo-only filter).

      There are couple of alternatives like using 2dsphere index, however for people moving from 2.4 to 2.6, this is a surprising change that works well in 2.4 (and works well again in 2.8)

      Attachments

        Activity

          People

            Unassigned Unassigned
            anil.kumar Anil Kumar
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: