[SERVER-16384] 2d compound index results in a wider object scan for certain field name orderings
Created: 02/Dec/14  Updated: 26/Sep/17  Resolved: 11/Jan/15

Status: Closed
Project: Core Server
Component/s: Geo, Querying
Affects Version/s: 2.6.5
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Anil Kumar
Assignee: Unassigned
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:
  1. Generate data: 20K large documents and the related compound 2d index

    function generateData(count, uniqueAppIds) {
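    	// Pool of application ids that documents are assigned to at random.
    	var appIds = [];
    	for (var i = 0; i < uniqueAppIds; ++i) {
    		appIds[i] = ObjectId();
    	}
     
    	// Filler string of roughly 260 KB so that every document is large.
    	var dummyx = new Array(10000).join("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
     
    	for (var j = 0; j < count; j++) {
    		var doc = {
    			extra_s: dummyx,
    			// Pick from the whole pool, not only the first ten ids.
    			xid: appIds[Math.floor(Math.random() * uniqueAppIds)],
    			pos: [Math.round(Math.random() * 360) - 180, Math.round(Math.random() * 180) - 90]
    		};
    		db.places.insert(doc);
    	}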
     
    	/* Ensure Indices */
    	db.places.ensureIndex({pos: "2d", xid: 1});
    }
     
    generateData(20000, 100);

  2. Drop VM cache

    # on macOS this would be
    sudo purge

  3. Run the query that is expected to work fine

    db.places.find({pos: { $nearSphere: [ 0, 0 ], $maxDistance: 125}, xid: ObjectId("547dbbe1290b9685990ca000"), xix: 1}).explain()

    {
    	"cursor" : "GeoSearchCursor",
    	"isMultiKey" : false,
    	"n" : 0,
    	"nscannedObjects" : 0,
    	"nscanned" : 20000,
    	"nscannedObjectsAllPlans" : 0,
    	"nscannedAllPlans" : 20000,
    	"scanAndOrder" : false,
    	"indexOnly" : false,
    	"nYields" : 0,
    	"nChunkSkips" : 0,
    	"millis" : 88,
    	"indexBounds" : {
    		
    	},
    	"server" : "aks-osx.local:37018",
    	"filterSet" : false
    }

  4. Drop VM cache

    sudo purge

  5. Run the query that does a wider object scan than intended

    db.places.find({pos: { $nearSphere: [ 0, 0 ], $maxDistance: 125}, xid: ObjectId("547dbbe1290b9685990ca000"), cix: 1}).explain()

    {
    	"cursor" : "GeoSearchCursor",
    	"isMultiKey" : false,
    	"n" : 0,
    	"nscannedObjects" : 20000,
    	"nscanned" : 20000,
    	"nscannedObjectsAllPlans" : 20000,
    	"nscannedAllPlans" : 20000,
    	"scanAndOrder" : false,
    	"indexOnly" : false,
    	"nYields" : 0,
    	"nChunkSkips" : 0,
    	"millis" : 13024,
    	"indexBounds" : {
    		
    	},
    	"server" : "aks-osx.local:37018",
    	"filterSet" : false
    }
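
The two explain() outputs above report the same n and nscanned but differ in nscannedObjects (0 versus 20000). A minimal sketch of a check for that pattern follows. The helper name looksLikeWideObjectScan and its criterion are assumptions made for illustration; they are not part of the server or the shell.

```javascript
// Hypothetical helper (not a MongoDB API): flags a 2.6-style explain()
// document in which more documents were fetched than were returned.
function looksLikeWideObjectScan(explain) {
	// nscannedObjects counts fetched documents; n counts returned documents.
	return explain.nscannedObjects > explain.n;
}

// Values copied from the two explain() outputs in the steps above.
var expected = { n: 0, nscannedObjects: 0, nscanned: 20000, millis: 88 };
var widened = { n: 0, nscannedObjects: 20000, nscanned: 20000, millis: 13024 };

var expectedIsWide = looksLikeWideObjectScan(expected); // false
var widenedIsWide = looksLikeWideObjectScan(widened);   // true
```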

Participants:
Case:

 Description   

When a compound 2d index is used, the query framework performs a wider object scan than intended for certain orderings of the field names in the query filter.

As the repro output shows, if the query filter contains an additional field whose name is lexicographically lower than the non-geo field of the compound 2d index (here cix versus xid), the query performs a wider object scan: every object that passes the geo-only filter is loaded. In the repro, nscannedObjects rises from 0 to 20000 and the query time from 88 ms to 13024 ms.
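
The ordering condition described above can be sketched as a plain string comparison. The function below is a hypothetical illustration, assuming that "lexicographically lower" means ordinary JavaScript string ordering; it does not reproduce the actual logic of the 2.6 query planner.

```javascript
// Hypothetical illustration: returns true when an extra filter field name
// sorts before the non-geo field of the compound 2d index.
function sortsBeforeIndexedField(extraField, indexedField) {
	return extraField < indexedField;
}

// Field names taken from the two queries in the repro steps.
var cixSortsFirst = sortsBeforeIndexedField("cix", "xid"); // true (wide scan query)
var xixSortsFirst = sortsBeforeIndexedField("xix", "xid"); // false (expected query)
```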

There are a couple of alternatives, such as using a 2dsphere index. However, for people moving from 2.4 to 2.6 this is a surprising change: the same query works well in 2.4 (and works well again in 2.8).
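
A rough sketch of the 2dsphere alternative mentioned above follows. The helper names and the Earth radius constant are assumptions for illustration, and the approach has not been verified against the data set in this report. With a GeoJSON point, $maxDistance is expressed in meters rather than radians, so the sketch converts the value.

```javascript
// Assumed approximate Earth radius in meters, used to convert radians to meters.
var EARTH_RADIUS_METERS = 6378100;

// Hypothetical helper: compound index spec that uses 2dsphere instead of 2d.
function sphereIndexSpec() {
	return { pos: "2dsphere", xid: 1 };
}

// Hypothetical helper: builds the geo part of the filter with a GeoJSON point.
function nearSphereFilter(lng, lat, maxDistanceRadians) {
	return {
		$nearSphere: {
			$geometry: { type: "Point", coordinates: [lng, lat] },
			$maxDistance: maxDistanceRadians * EARTH_RADIUS_METERS
		}
	};
}

// In the mongo shell these would be used roughly as follows
// (someId is a placeholder for an existing xid value):
//   db.places.ensureIndex(sphereIndexSpec());
//   db.places.find({pos: nearSphereFilter(0, 0, 0.01), xid: someId, cix: 1}).explain();
var filter = nearSphereFilter(0, 0, 0.01);
```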



 Comments   
Comment by Daniel Pasette (Inactive) [ 11/Jan/15 ]

This behavior was fixed in 2.7, and it is not possible to backport a fix to 2.6.
