[SERVER-1648] When searching with bounding box query, all results are not returned. Created: 20/Aug/10  Updated: 12/Jul/16  Resolved: 01/Jul/11

Status: Closed
Project: Core Server
Component/s: Geo
Affects Version/s: None
Fix Version/s: 1.9.1

Type: Bug Priority: Major - P3
Reporter: Raine Lightner Assignee: Greg Studer
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File 11.png     PNG File mapissue.png     PNG File mapissue.png    
Operating System: ALL
Participants:

 Description   

The query count says 78569 but the actual number returned is 56680.

http://db.tt/TdExxmq

log for c# driver:

09:39:21 [conn36] query MapItems.Standard reslen:4194356 nscanned:56680 { LatLng: { $within: { $box: [ [ 40.66964021047412, -124.189453125 ], [ 49.23275712290425, -112.060546875 ] ] } } } nreturned:56680 1435ms

log for mongo.exe

09:41:24 [conn4] query MapItems.Standard reslen:264 nscanned:78569 { query: { LatLng: { $within: { $box: [ [ 40.66964021047412, -124.189453125 ], [ 49.23275712290425, -112.060546875 ] ] } } }, $explain: true } nreturned:1 1357ms



 Comments   
Comment by Greg Studer [ 06/Jul/11 ]

2.0 - we do even/odd/stable/unstable. Can't really backport this since it requires the other major changes to be useful - you can easily enable getMore in 1.8, but it will be very slow and memory hungry in cases like yours.

Comment by Raine Lightner [ 06/Jul/11 ]

When will this show up in a production build, do you know?

Comment by Greg Studer [ 06/Jul/11 ]

The nightly won't update until our integration tests run successfully - there were a bunch of patches pushed for 1.9.1 which probably triggered some issues.

Comment by Raine Lightner [ 05/Jul/11 ]

Nightly finally updated to 7/5.

I can confirm that this has fixed the issue with the GetMore not occurring. Thanks

-Raine

Comment by Raine Lightner [ 01/Jul/11 ]

Nightly still hasn't updated since 6/27.. kinda misleading.

Comment by Greg Studer [ 01/Jul/11 ]

Closing again, just to clean up 1.9.1 tickets - again, reopen if not fixed in 1.9.1 or a later build.

Comment by Raine Lightner [ 30/Jun/11 ]

I don't have the setup to create a build so I'll have to wait for that "nightly" that's not so "nightly"

-Raine

Comment by Greg Studer [ 30/Jun/11 ]

Yeah went in on 6/28 - have an enhanced test that adds 1-10 fields of diff types, seems to work fine too. It'll be in 1.9.1 if nothing else - not sure when the bleeding-edge builds are put online offhand.

Comment by Raine Lightner [ 30/Jun/11 ]

The exe build is from 6/27 so not the latest latest it looks like.

Comment by Raine Lightner [ 30/Jun/11 ]

I wonder if you changed your document to have more info in it since it might be the size of the document that affects this issue.

-Raine

Comment by Greg Studer [ 30/Jun/11 ]

What's the git version of your nightly?

The test here is the check for getMore - it should complete successfully if your version supports this:
https://raw.github.com/mongodb/mongo/master/jstests/slowNightly/geo_mnypts.js

You can run it using "buildscripts/smoke.py --mode=files jstests/slowNightly/geo_mnypts.js" or (maybe easier)

> mongod & mongo
in the shell:
load('geo_mnypts.js')

If it does complete successfully, there's something else we're missing, if not, the changes probably haven't gone into the nightly.

Comment by Raine Lightner [ 30/Jun/11 ]

I tried both the 1.9 and the latest nightly under 1.9 and it's still showing the same issue.

-Raine

Comment by Greg Studer [ 28/Jun/11 ]

You were absolutely correct previously: getMore somehow was never enabled with the new incremental $within. Sorry about the confusion, and thanks for your help tracking this down. Some testing indicates the fix unfortunately won't help you much if backported to 1.8, since the older logic pulls all neighbor-box points into memory at once, which can take a long time for large queries.

The fix should now be in the latest build of 1.9, if you'd like to verify it on your end, and will be in 1.9.1 when that is released (soon).

Feel free to reopen if you continue to have problems.
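
To illustrate why a missing getMore makes count() and itcount() disagree, here is a toy model in plain JavaScript. It is only a sketch and not the server's actual code: makeToyServer and iterateAll are hypothetical names, and the fixed batch size is an assumption made for simplicity (the numbers in this ticket suggest the real cutoff depends on reply size rather than on a fixed document count). The figures 134056 and 65536 are borrowed from the shell session later in this thread.

```javascript
// Toy model only: a "server" that hands out matches in batches, and a client
// that keeps asking for the next batch (the role getMore plays) until done.
function makeToyServer(totalMatches, batchSize, getMoreEnabled) {
  var sent = 0;
  function nextBatch() {
    var n = Math.min(batchSize, totalMatches - sent);
    sent += n;
    // hasMore tells the client whether asking again is worthwhile.
    // With getMore disabled, the client is told there is nothing more.
    return { n: n, hasMore: getMoreEnabled && sent < totalMatches };
  }
  return {
    // What count() would report: every matching document.
    count: function () { return totalMatches; },
    // Initial query: resets the position and returns the first batch.
    query: function () { sent = 0; return nextBatch(); },
    // Follow-up batches.
    getMore: nextBatch
  };
}

// What itcount() does conceptually: iterate and count what actually arrives.
function iterateAll(server) {
  var reply = server.query();
  var seen = reply.n;
  while (reply.hasMore) {
    reply = server.getMore();
    seen += reply.n;
  }
  return seen;
}

var withoutGetMore = makeToyServer(134056, 65536, false);
var withGetMore = makeToyServer(134056, 65536, true);
console.log(withoutGetMore.count(), iterateAll(withoutGetMore));
console.log(withGetMore.count(), iterateAll(withGetMore));
```

In this model the count is always the full match total, while the iterated total stops at the first batch whenever follow-up batches are unavailable, which matches the symptom reported here.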

Comment by auto [ 28/Jun/11 ]

Author: gregs (gregstuder) <greg@10gen.com>

Message: enable large $within queries for newer incremental return SERVER-1648
Branch: master
https://github.com/mongodb/mongo/commit/ddc9c33d2dcfcb34a0646e3fdced74a79d948d35

Comment by Raine Lightner [ 27/Jun/11 ]

Nope, a bounding box is a bounding box.. it seems if the amount of data is > some arbitrary limit it doesn't return all the results.

I repaired the DB and redid the indexes and got the same results.

No fields that purposely have that name.

-Raine

Comment by Greg Studer [ 27/Jun/11 ]

Hmm... that is weird, are there any messages in mongod? Talking with Jared, are there regions over which this query works correctly?

Another thing to try may be to rebuild the index, if you haven't already (and this isn't a live server) - just to rule out upgrades as a source of the problem.

And as a long shot, could any of your documents have a field named "$err"?

Comment by Raine Lightner [ 27/Jun/11 ]

Mongo Server version 1.8.2. Using the latest 10gen source code from github.

> db.Standard2.find(
    { "LL" : { "$within" : { "$box" : [[20.75324, -136.40625], [42.16970, -111.09375]] } }, "Arch" : false },
    {"Id":1, "WptId":1, "Avail":1, "SubscrOnly":1, "PxXY":1, "gH":1}
  ).count()
134056

> db.Standard2.find(
    { "LL" : { "$within" : { "$box" : [[20.75324, -136.40625], [42.16970, -111.09375]] } }, "Arch" : false },
    {"Id":1, "WptId":1, "Avail":1, "SubscrOnly":1, "PxXY":1, "gH":1}
  ).itcount()
65536

> db.Standard2.find(
    { "LL" : { "$within" : { "$box" : [[20.75324, -136.40625], [42.16970, -111.09375]] } }, "Arch" : false },
    {"Id":1, "WptId":1, "Avail":1, "SubscrOnly":1, "PxXY":1, "gH":1}
  ).explain()
{
"cursor" : "GeoBrowse-box",
"nscanned" : 134056,
"nscannedObjects" : 134056,
"n" : 134056,
"millis" : 1468,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {

}
}

Comment by Raine Lightner [ 27/Jun/11 ]

> db.Standard2.find({ "LL" : { "$within" : { "$box" : [[20.753242249926792, -136.40625], [42.169701353131515, -111.09375000000001]] } }, "Arch" : false }).count()
134056
> db.Standard2.find({ "LL" : { "$within" : { "$box" : [[20.753242249926792, -136.40625], [42.169701353131515, -111.09375000000001]] } }, "Arch" : false }).itcount()
12200

Comment by Greg Studer [ 27/Jun/11 ]

Hmm... haven't seen an issue with fields returned affecting results before - do you get the same results when you run the query from the mongo shell (using count() and itcount() - itcount() iterates through all the results explicitly, like your loop variable) - this will narrow down the issue to the core server or something in the C# driver. Without SetFields, does the issue go away?

Just as a general question, is the correct # of docs to return here 104,206? And since it isn't clear from before, what version of mongodb are you using now, and what's your C# driver version? - you can fill in the JIRA case if you like.

A good test case might be to start from a single point, then expand the search box slowly until you see the problems occurring. If no problems occur, start adding fields until you see a difference. It'd be helpful to see the actual query and ensureIndex calls too, along with any suspicious entries in the mongod/mongos log during the time you're returning results.
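
The expanding-box test described above could be sketched roughly as follows. This is a minimal illustration and not something from the original thread: findFirstMismatch, countFn and iterCountFn are hypothetical names, and the two callbacks stand in for running count() and itcount() on the same $box query.

```javascript
// Hypothetical helper: grow a square box around a centre point, one step at a
// time, until the reported count and the iterated count disagree.
// countFn(box) and iterCountFn(box) are supplied by the caller.
function findFirstMismatch(center, stepDeg, maxSteps, countFn, iterCountFn) {
  for (var i = 1; i <= maxSteps; i++) {
    var half = i * stepDeg;
    // Same [[lower-left], [upper-right]] shape as the $box queries in this ticket.
    var box = [
      [center[0] - half, center[1] - half],
      [center[0] + half, center[1] + half]
    ];
    var counted = countFn(box);
    var iterated = iterCountFn(box);
    if (counted !== iterated) {
      // First box size at which results go missing.
      return { box: box, count: counted, itcount: iterated };
    }
  }
  return null; // no discrepancy seen within maxSteps
}
```

In the mongo shell, the callbacks might be written as function (box) { return db.Standard2.find({ LL: { $within: { $box: box } } }).count(); } and the same with itcount() in place of count(), assuming the collection and field names used elsewhere in this ticket.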

Comment by Raine Lightner [ 27/Jun/11 ]

I guess I've narrowed this bug down to a lack of data being returned.

I'm asking for only certain columns to be returned:

"Id", "WPTId", "Avail", "SubcrOnly", "PxXY", "gH" and the "Count" returned for one query is 104,206 yet if I increment a variable in a loop that works with that data only 48,210 items are actually returned.

If I execute the same query and tack on two more columns "Diff" and "Terr" the Count is the same but the loop variable is only 39,199.

The setFields part:

cursor.SetFields("Id", "WPTId", "Avail", "SubcrOnly", "PxXY", "gH", "Diff", "Terr");

This is the data structure being returned:
Id INT 4
WptId INT 4
Avail BOOL 1
SubcrOnly BOOL 1
PxXY INT[] 4 (Holds an X,Y value)
gH String 8
Diff INT 4
Terr INT 4

Thoughts?
-Raine

Comment by Greg Studer [ 24/Jun/11 ]

getMore isn't geo specific, but if you're asking whether all the $box results are found at once or incrementally, $within queries are mostly incremental in 1.8 (though there are cases where all the points in a neighbor region are saved to memory at once, which can lead to slowdowns depending on your point distribution).

In 1.9.1 $within queries are entirely incremental and return small batches of points at a time.

Comment by Raine Lightner [ 23/Jun/11 ]

Does $box support getMore?

Comment by Greg Studer [ 01/Mar/11 ]

If you are still having this issue, can you provide another link to the test data? The dropbox link was/is still down. As it stands we can't really start debugging the issue without more information.

Comment by Raine Lightner [ 19/Nov/10 ]

This still appears to be an issue in the latest nightly

Comment by Mathias Stearn [ 12/Nov/10 ]

Could you rerun your test with the latest 1.7 nightly? $box queries have been almost completely rewritten since the bug was filed so it may already be solved. If it is still an issue, it would be very helpful if you could provide your test data. The dropbox link appears to be dead. Attaching the file would probably be best.

Comment by Raine Lightner [ 20/Aug/10 ]

Area in red missing results due to not being returned

Generated at Thu Feb 08 02:57:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.