[SERVER-16592] bad order search logic in Geonear Created: 18/Dec/14  Updated: 09/Jan/15  Resolved: 09/Jan/15

Status: Closed
Project: Core Server
Component/s: Geo, Index Maintenance, Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: guipulsar Assignee: Siyuan Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-16633 Add timer to geo near stage to track ... Closed
Operating System: ALL
Steps To Reproduce:

db.runCommand({
    geoNear: "tablegui",
    near: [2.298800, 48.854355],
    spherical: true,
    distanceMultiplier: 6371,
    query: { code_postal: 2 },
    maxDistance: 2000 / 6371,
    limit: 1500
});

Participants:

 Description   

Hi,
When using geoNear, I add a conditional query (which matches nothing), but the command takes a very long time because it first scans all the geo matches.
This seems illogical: it takes a negligible time (0.00001) to see that the conditional query returns nothing, yet a long time is spent searching all the geo distances.
Why not apply the simple query filter before the complex geo filter?

Consider this query, tested with every possible index combination:

db.runCommand({
    geoNear: "tablegui",
    near: [2.298800, 48.854355],
    spherical: true,
    distanceMultiplier: 6371,
    query: { code_postal: 2 },
    maxDistance: 2000 / 6371,
    limit: 1500
});

There is no document with code_postal equal to 2, yet the query takes a very long time because it searches by distance first.
Is something wrong, or am I missing something?

regards
pulsar



 Comments   
Comment by guipulsar [ 31/Dec/14 ]

If you consider this not to be a bug, it might be a good idea to say a few words about it in the documentation, because it is at least unclear from my point of view.

Comment by Siyuan Zhou [ 30/Dec/14 ]

Hi pulsar,

It's not a bug. The geo-only search behaves as designed: a geo query with no predicate on code_postal cannot use the compound index { code_postal: 1, loc: "2dsphere" }. The b-tree entries of a compound index are ordered by code_postal first; entries with the same code_postal are then ordered by loc using its geohash. There is no total order on loc across the b-tree, so a geo query without code_postal cannot tell which portion of the b-tree should be scanned.
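The ordering argument above can be illustrated with a small plain-JavaScript sketch. This is a toy model, not MongoDB's actual index code: the `cell` field is a hypothetical stand-in for the geohash cell of loc, and the helper names `buildIndex` and `positionsWhere` are made up for illustration.

```javascript
// Toy model of key ordering in a compound index. "cell" is a hypothetical
// stand-in for the geohash cell of loc; this is not MongoDB internals.
const sampleDocs = [
  { code_postal: 75007, cell: 3 },
  { code_postal: 75001, cell: 7 },
  { code_postal: 75007, cell: 1 },
  { code_postal: 75015, cell: 3 },
  { code_postal: 75001, cell: 3 },
];

// Order the keys the way a compound index would: first field, then second.
function buildIndex(entries, first, second) {
  return entries
    .map((d) => [d[first], d[second]])
    .sort((a, b) => a[0] - b[0] || a[1] - b[1]);
}

// Positions in the sorted key list where the given key slot equals value.
function positionsWhere(index, slot, value) {
  return index.flatMap((key, i) => (key[slot] === value ? [i] : []));
}

const byPostalThenCell = buildIndex(sampleDocs, "code_postal", "cell");

// Keys for one cell are scattered across the index...
const cellPositions = positionsWhere(byPostalThenCell, 1, 3);
// ...while keys for one code_postal sit next to each other.
const postalPositions = positionsWhere(byPostalThenCell, 0, 75007);

console.log(cellPositions, postalPositions); // [ 0, 3, 4 ] [ 2, 3 ]
```

In this model, an equality on code_postal selects one contiguous run of keys, whereas a search on the cell alone has no single run to scan.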

If you have any questions about geo queries or MongoDB in general, the Google group and Stack Overflow are the best places to ask, since there are many active MongoDB users and experts in the community.

Comment by guipulsar [ 22/Dec/14 ]

I am on 2.6.6.
If you create a compound index with code_postal (or another field) in this order:
db.tablegui.ensureIndex({ code_postal: 1, loc: "2dsphere" })
and then perform this geo search without any query filter:
db.runCommand({ geoNear: "tablegui", near: [2.298800, 48.854355], spherical: true, distanceMultiplier: 6371, maxDistance: 2000 / 6371, limit: 1500 });

you will get an error:
/* 0 */
{
"ok" : 0,
"errmsg" : "can't get query runner"
}

I guess it's a related bug. I'm facing many problems with geoNear;
see my other related report https://jira.mongodb.org/browse/SERVER-16594?filter=-2 . I guess something is going wrong, but I can't
find what exactly.

Comment by Siyuan Zhou [ 22/Dec/14 ]

pulsar, thanks for reporting this issue. What version are you using? I am able to reproduce this problem on 2.8.0-rc3 with the following $near query.

var coll = db.geoPerf;
 
var bulk = coll.initializeUnorderedBulkOp();
for (var i = 0; i < 2000; i++) {
  // Point around 0, 0
  bulk.insert({loc: [Math.random(), Math.random()], zip_code: i % 100 });
}
bulk.execute();
 
coll.ensureIndex({loc: "2dsphere", zip_code: 1});
// coll.ensureIndex({zip_code: 1, loc: "2dsphere"});
 
// Searching the whole world. GeoJSON implies spherical query.
var ex = coll.find({
  zip_code: 2000,
  loc: {$near: { $geometry: {
    type: "Point", coordinates: [0, 0]
  }}}
}).explain(true);
printjson(ex);

The compound index { loc: "2dsphere", zip_code: 1 } is very slow in this case, but {zip_code: 1, loc: "2dsphere"} works as expected. The explain() output shows that the index scan is much faster with the index on {zip_code: 1, loc: "2dsphere"}. As a workaround, building an index on {zip_code: 1, loc: "2dsphere"} should work for you.

PS: in 2.8.0-rc3, the execution time spent on the geo near stage (and its sub-stages) is always shown as 0; this bug has been filed as SERVER-16633.
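To make the difference between the two index orders concrete, here is a rough plain-JavaScript sketch that counts how many index keys a scan would touch for a zip_code that matches nothing. It is a toy model under stated assumptions, not the server's geo near stage: the `cell` value is a hypothetical stand-in for a geohash cell, and `scanGeoFirst`, `scanZipFirst`, and `lowerBound` are illustrative names.

```javascript
// Toy model only: counts keys a scan would touch, not MongoDB internals.
// Same shape as the repro above: 2000 documents, zip_code = i % 100.
// "cell" is a hypothetical stand-in for the geohash cell of loc.
const modelEntries = [];
for (let i = 0; i < 2000; i++) {
  modelEntries.push({ cell: i % 50, zip_code: i % 100 });
}

// Index ordered (cell, zip_code): a whole-world near search visits every
// cell, so every key is examined and zip_code is checked key by key.
function scanGeoFirst(wantedZip) {
  let examined = 0;
  let matched = 0;
  for (const e of modelEntries) {
    examined++;
    if (e.zip_code === wantedZip) matched++;
  }
  return { examined, matched };
}

// Index ordered (zip_code, cell): keys sorted by zip_code first.
const zipFirstKeys = modelEntries
  .map((e) => [e.zip_code, e.cell])
  .sort((a, b) => a[0] - b[0] || a[1] - b[1]);

// Binary search for the first key whose zip_code is >= zip.
function lowerBound(keys, zip) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid][0] < zip) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Seek to the zip_code range, then walk only the keys inside it.
function scanZipFirst(wantedZip) {
  let examined = 0;
  for (
    let i = lowerBound(zipFirstKeys, wantedZip);
    i < zipFirstKeys.length && zipFirstKeys[i][0] === wantedZip;
    i++
  ) {
    examined++;
  }
  return { examined };
}

console.log(scanGeoFirst(2000)); // { examined: 2000, matched: 0 }
console.log(scanZipFirst(2000)); // { examined: 0 }
```

In this model, the equality on zip_code bounds the scan before any geo work happens, which is consistent with the behaviour reported for the {zip_code: 1, loc: "2dsphere"} index.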

Comment by guipulsar [ 22/Dec/14 ]

Actually, the time decreased with a compound index on loc and code_postal, but it is still very slow; it should be very fast to see that code_postal matches nothing, right? Here is my current index output with the corresponding explain:

[
	{
		"v" : 1,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "basedetest.tablegui"
	},
	{
		"v" : 1,
		"key" : {
			"loc" : "2dsphere",
			"code_postal" : 1
		},
		"name" : "loc_2dsphere_code_postal_1",
		"ns" : "basedetest.tablegui",
		"2dsphereIndexVersion" : 2
	}
]
 
 
{
    "results" : [],
    "stats" : {
        "nscanned" : NumberLong(88210),
        "objectsLoaded" : NumberLong(88210),
        "avgDistance" : NaN,
        "maxDistance" : 0,
        "time" : 383
    },
    "ok" : 1
}

Comment by Ramon Fernandez Marina [ 21/Dec/14 ]

pulsar, can you please post the output of

db.tablegui.getIndexes()

?

Generated at Thu Feb 08 03:41:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.