[SERVER-14264] Compound index on 2dsphere and datetime very slow in 2.6.1 and different result than in 2.4.10 Created: 16/Jun/14 Updated: 14/Apr/16 Resolved: 13/Feb/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.6.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Fabian Fülling [X] | Assignee: | Siyuan Zhou |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | RPL 0 3/13/15 |
| Participants: |
| Description |
|
I have a collection of documents, each containing a datetime field 'from' (and also 'to', which was not used in the queries) and multiple polygons. Up to version 2.4.10 I handled this by storing an array of geometry subdocuments in "location.geometry".
Version 2.6 supports MultiPolygons, and it seems that my workaround of manual "multi polygons" no longer works. The query times are much slower now. The pure geo query: (on 2.6.1 the geo field name is "geometry")
Both versions return 4599 results. But when I query for both the date and the location, it gets problematic. The full query is: (on 2.6.1 the geo field name is "geometry")
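For illustration only, the two document layouts mentioned in this report could look roughly like the sketch below. Only the field names 'from', 'to', "location.geometry" and "geometry" come from the report; the coordinates, the dates, the exact nesting of the array, and the helper `toMultiPolygon` are assumptions. The conversion relies on the GeoJSON convention that a MultiPolygon's coordinates are a list of Polygon coordinate arrays.

```javascript
// Hypothetical sample geometries (coordinates are made up for the sketch).
const polygonA = {
  type: "Polygon",
  coordinates: [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]],
};
const polygonB = {
  type: "Polygon",
  coordinates: [[[2, 2], [2, 3], [3, 3], [3, 2], [2, 2]]],
};

// Assumed 2.4-style layout: an array of Polygon geometries.
const docWithArray = {
  from: new Date("2014-06-01T00:00:00Z"),
  to: new Date("2014-06-02T00:00:00Z"),
  location: { geometry: [polygonA, polygonB] },
};

// Hypothetical helper: merge GeoJSON Polygons into a single MultiPolygon.
function toMultiPolygon(polygons) {
  return {
    type: "MultiPolygon",
    coordinates: polygons.map((p) => p.coordinates),
  };
}

// Assumed 2.6-style layout: one MultiPolygon in the field "geometry".
const docWithMultiPolygon = {
  from: docWithArray.from,
  to: docWithArray.to,
  geometry: toMultiPolygon(docWithArray.location.geometry),
};
```

An index over the array layout has one set of geo keys per array element, which is why the multikey question comes up further down in the comments.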
On 2.4.10 the query takes only 49 ms:
But on 2.6.1 the query takes ~200 ms and the number of results differs considerably:
The indexBounds look a bit strange. If I instead query just for a date (with an index on
What is the correct way to index and query a collection for a specific date and geo location? Thanks a lot! |
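As a rough sketch of the kind of compound index and query being asked about: the key pattern below is the one quoted in the comments of this ticket, while the collection name `events`, the `$geoIntersects` operator, the point coordinates and the dates are assumptions, because the original query text is not preserved here. It shows the shape of the request only and is not a confirmed answer to the question.

```javascript
// Key pattern quoted in the comments: date field first, then the 2dsphere field.
// (On 2.6.1 the reporter's geo field is named "geometry" instead.)
const indexSpec = { from: 1, "location.geometry": "2dsphere" };

// Hypothetical helper: combine a half-open date range with a geo predicate.
// $geoIntersects with a Point is an assumption, not the reporter's query.
function buildQuery(start, end, lng, lat) {
  return {
    from: { $gte: start, $lt: end },
    "location.geometry": {
      $geoIntersects: {
        $geometry: { type: "Point", coordinates: [lng, lat] },
      },
    },
  };
}

const query = buildQuery(
  new Date("2014-06-01T00:00:00Z"),
  new Date("2014-06-02T00:00:00Z"),
  13.4,
  52.5
);

// In a 2.4/2.6 mongo shell these objects would be used roughly as follows
// ("events" is a made-up collection name):
//   db.events.ensureIndex(indexSpec)
//   db.events.find(query).explain()
```

Comparing the `explain()` output of the date-only query, the geo-only query and the combined query is how the differing index bounds discussed in this ticket become visible.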
| Comments |
| Comment by Siyuan Zhou [ 13/Feb/15 ] |
|
Closing as a dup of |
| Comment by Siyuan Zhou [ 13/Feb/15 ] |
|
Thank you very much for pointing out this problem and providing the data. We're able to reproduce this issue and we can confirm that this is caused by the different behavior of 2.4 and 2.6 (also 3.0). The bounds of "from" can be improved to intersect "$gte" and "$lt", as you mentioned. Unfortunately, we don't have that yet. Thanks, |
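To make the idea of intersecting the "$gte" and "$lt" bounds concrete, here is a minimal sketch. It is not the server's query planner code; the function names `rangeFor`, `tighter` and `intersect` and the sample dates are invented for illustration. Each comparison is treated as a one-sided range, and the two ranges are combined into a single bounded interval on "from".

```javascript
// Each comparison yields a one-sided range; null means "unbounded".
function rangeFor(op, value) {
  switch (op) {
    case "$gt":  return { low: { value, inclusive: false }, high: null };
    case "$gte": return { low: { value, inclusive: true }, high: null };
    case "$lt":  return { low: null, high: { value, inclusive: false } };
    case "$lte": return { low: null, high: { value, inclusive: true } };
    default: throw new Error("unsupported operator: " + op);
  }
}

// Pick the more restrictive of two endpoints. `sign` is +1 for lower
// endpoints (larger is tighter) and -1 for upper endpoints (smaller is tighter).
function tighter(a, b, sign) {
  if (a === null) return b;
  if (b === null) return a;
  const diff = sign * (a.value - b.value); // works for numbers and Dates
  if (diff > 0) return a;
  if (diff < 0) return b;
  // Same value: the endpoint is inclusive only if both sides are.
  return { value: a.value, inclusive: a.inclusive && b.inclusive };
}

// Intersect two ranges into one interval.
function intersect(x, y) {
  return { low: tighter(x.low, y.low, 1), high: tighter(x.high, y.high, -1) };
}

const start = new Date("2014-06-01T00:00:00Z");
const end = new Date("2014-06-02T00:00:00Z");

// Describes the half-open interval [start, end).
const bounds = intersect(rangeFor("$gte", start), rangeFor("$lt", end));
```

Without the intersection, an index scan has to use one of the one-sided ranges, which covers far more of the "from" keys than the bounded interval does.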
| Comment by David Storch [ 18/Sep/14 ] |
|
Hi FabFuel, My apologies for the delay in getting to this. My colleague siyuan.zhou@10gen.com is currently working on performance of geo queries, so I'm reassigning this to him. Please continue to watch this ticket for updates, as we hope to make some progress soon. Best, |
| Comment by David Storch [ 11/Aug/14 ] |
|
Thanks FabFuel. We have received the data successfully, and will update you when we have more info. |
| Comment by Fabian Fülling [X] [ 11/Aug/14 ] |
|
Hi david.storch, I have uploaded the dataset. Best |
| Comment by David Storch [ 11/Aug/14 ] |
|
Hi FabFuel, I have re-opened the ticket and sent you instructions for transferring the data set. I'm going to put this in the "Waiting for User Input" state pending the data. Thanks for your help on this! Best, |
| Comment by Fabian Fülling [X] [ 09/Aug/14 ] |
|
Hi david.storch, sorry for the delay! The issue is still present. It has nothing to do with the write command; it's just about reading or using the indexes. I have downloaded 2.6.3 and 2.4.10, started fresh mongod instances and did a benchmark with this query:
These are the results, which can be reproduced: on 2.6.3
and 2.4.10 is up to 7x faster:
It seems to me that the "from" condition is not used to reduce the dataset before using the S2NearCursor. Best PS: I'm not allowed to reopen the ticket |
| Comment by David Storch [ 08/Aug/14 ] |
|
Hi FabFuel, I have marked this ticket as resolved due to inactivity. If you would like to keep working with us to diagnose this issue, please do not hesitate to reopen. Best, |
| Comment by David Storch [ 07/Aug/14 ] |
|
Hi FabFuel, We haven't heard from you in a while, so I just wanted to check in. Are you still affected by this issue? Would you be able to share a dataset with us as described above by my colleague Thomas? Best, |
| Comment by Thomas Rueckstiess [ 24/Jul/14 ] |
|
Hi Fabian, I've discussed this with David and we came to the conclusion that the repro script he posted is actually not a valid reproduction of the issue. The reason the above script is slower in 2.6 is related to the introduction of the new write commands, which always wait for completion. This means the loop runs slower and 2.6 creates more distinct date values. We confirmed this with the distinct command. In 2.4 we get
And in 2.6 we get
When we add a sleep(1) to the code after each insert, or randomize the dates, the difference is no longer there and we get roughly the same query times for both versions. This means we do not have a way to reproduce the insert difference you are describing. Is this still an issue for you? Are you able to share a test dataset with us that shows the slower queries on 2.6? We can arrange for the dataset to be shared privately if you like. Regards, |
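The effect described here can be illustrated with a toy model, which is not the repro script itself (that script is not preserved in this export). The sketch assumes each insert stamps the document with the current time at millisecond resolution, which is the resolution of a BSON date, and that inserts are evenly spaced; the function name `distinctDates` and the per-insert timings are invented for illustration.

```javascript
// Toy model: insert i happens at i * usPerInsert microseconds and is stamped
// with that time truncated to whole milliseconds. Returns how many distinct
// date values end up in the collection.
function distinctDates(inserts, usPerInsert) {
  const seen = new Set();
  for (let i = 0; i < inserts; i++) {
    seen.add(Math.floor((i * usPerInsert) / 1000));
  }
  return seen.size;
}

// Assumed timings: fast unacknowledged inserts versus slower acknowledged ones.
const fastLoop = distinctDates(10000, 10);  // many inserts share a millisecond
const slowLoop = distinctDates(10000, 500); // far fewer inserts per millisecond
```

With these assumed timings the slower loop produces 50 times as many distinct "from" values for the same number of documents, so the two versions end up querying differently distributed data even though the script is identical.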
| Comment by David Storch [ 18/Jun/14 ] |
|
Hi FabFuel, Thanks for providing the additional information. I can reproduce the slower execution in 2.6.2 as opposed to 2.4.10 with the following simple script:
I can only reproduce the problem when the "from" values are dates; the repro does not seem to work when they are numbers. I am going to do some further investigation and get back to you when I have more info. Thanks, |
| Comment by Fabian Fülling [X] [ 17/Jun/14 ] |
|
Hi Dave, thank you very much for the quick reply! But the query is still 4x slower than in 2.4.10 and it seems like the btree is not used at all. This is the explain:
With the MultiPolygon structure from 2.6, the index is not multiKey. This is the index:
There is no other 2dsphere index in this collection. This is the query:
Thanks in advance |
| Comment by David Storch [ 17/Jun/14 ] |
|
Hi FabFuel, Thanks very much for the bug report. It looks like you are running into a known issue that affects versions 2.6.0 and 2.6.1: see Just to confirm, the {from:1, 'location.geometry' : '2dsphere'} index is multikey, correct? Best, |