[SERVER-17851] $geoWithin and $geoIntersects do not match the query polygon's border Created: 01/Apr/15 Updated: 28/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Geo |
| Affects Version/s: | 2.6.8 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Alex Olieman | Assignee: | Backlog - Query Integration |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | qi-geo, query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||
| Assigned Teams: |
Query Integration
|
||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||
| Steps To Reproduce: | 1. insert linestring into test collection
2. create a 2dsphere index
3. query the index with a polygon
|
||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||
| Description |
|
The $geoWithin and $geoIntersects operators do not retrieve (parts of) the border of the polygon that they are queried with. This doesn't make sense, and the docs seem to agree:
This issue presents itself when all coordinate pairs of a linestring intersect (i.e. are part of) the polygon border. When one of the line's points on the border is moved to the inside of the polygon, that line is retrieved by the query as expected. I would be surprised if this issue is caused by floating point precision or the spherical reference system. |
| Comments |
| Comment by Alex Olieman [ 10/Apr/15 ] |
|
Hi siyuan.zhou@10gen.com, thank you for your explanation of how this behavior originates from S2. It is correct that the behavior I'm asking for corresponds to DE-9IM "covers" rather than "contains". In that sense, "contains" seems to be working correctly in MongoDB, but the documentation erroneously explains it as the "covers" behavior. As I understand, there is a fundamental problem with supporting intersection and containment of arbitrary edge sections with a polygon. It would probably involve some effort in adapting S2, which would be a bad investment considering that S2 hasn't been maintained since 2011. It seems, however, that the majority of real-world "shared edge" cases involve what you call exact edges. The main use case I'm referring to is to compute sensible unions of polygons, which is a common task in spatial computation (Davis, 2007). If I were using S2 directly, I would use the s2polygonbuilder for this task. But regardless of how any MongoDB user chooses to compute such unions, we need to be able to retrieve sensible "merge candidates" using an index. For the general case it would be sufficient to implement "intersects" as DE-9IM defines it, although more specific cases such as mine would benefit from the "covers" predicate. To illustrate this with an OpenStreetMap example: the goal is to merge polygons A, B, and C, but to exclude polygons D and E from the union. We are looking for merge candidates with A, because it has a smaller ObjectID and we have no prior information on which polygons to merge. A query that complies with DE-9IM intersects() retrieves A, B, C, D, and E. This is a good start, but leaves further narrowing down up to the user. If we were also able to query with a $geoCovers operator, but which only matches on exact shared edges, we would find that A only shares an edge with B, and subsequently that B shares an edge with C. In this sense, matching only polygons that share an exact edge could be a feature, rather than a workaround. |
| Comment by Siyuan Zhou [ 09/Apr/15 ] |
|
Hi AliOli, thanks a lot for your comment. Unfortunately, current implementation cannot guarantee a polygon contains or intersects its edge. We may cover the special case where the exact edge intersects with / is contained by the polygon, but it's hard to guarantee a part of edge has the same behavior. Geometry relationship testing is based on S2 library, which treats degenerate in a special way as explained in SERVER-13599. To support this desired behavior, we have to make some fundamental changes. Technically, instead of contains(), covers() is a better description of this behavior, according to DE-9IM. Cleaning dataset is an interesting use case, but we still need to gather more user feedback to justify the design. I really appreciate your pointing out this issue. We'll update the documentation to clear up the confusion. |
| Comment by Alex Olieman [ 02/Apr/15 ] |
|
Hi siyuan.zhou@10gen.com, of course: I'm working with an OpenStreetMap dataset which incorporates geographical features at a very detailed level. My use case is location detection in text, and visualizing the mentioned locations for a user. To do that, the level of detail in OSM is far to high. Most streets, for instance, consist of many segments (i.e. Ways in OSM), which have the same name but are not explicitly linked in the source data. To work with this data, it needs to be simplified (i.e. aggregated and abstracted where possible). The $geoWithin operator can thus be used to check if a geographical feature is contained within another geographical feature. If that is the case, the larger geometry can be kept, and its constituent parts can be thrown out. Herald Square, for example, is found as two geographical features in OSM: the entire square, and the park area that it contains. For any application where such level of detail is not required, it would be wonderful if we could use MongoDB to find all geometry that is contained within the square, and remove it from the collection. In some countries it is very common to have a street surrounding the square with the same name as the square. The issue I've reported here prevents me from simplifying this kind of geometry without a workaround. And in general: recognizing parts of a geometry as being "within" that geometry would be good for logical consistency. In the current situation, a polygon is retrieved in response to a $geoWithin query with its own geometry, but its borders are not. So, a polygon is within itself, but its edges are not? If there is a severe technical reason for this behavior to stay the same, I think it deserves a good explanation in the docs. |
| Comment by Siyuan Zhou [ 02/Apr/15 ] |
|
Hi AliOli, 2dsphere index still uses geohashing and b-tree, however, this issue is more relevant to how MongoDB deals with degenerate cases rather than indexing. We're still investigating this issue. Could you tell us more about your use case where a polygon should contain its edges? |
| Comment by Alex Olieman [ 02/Apr/15 ] |
|
Hi ramon.fernandez, thanks for confirming the bug. Does the 2dsphere index v2 still use geohashes in a b-tree? Or have you moved on to r-trees or similar to index the geometries? I'm willing to think along, but would need to know some more context. Cheers, |
| Comment by Ramon Fernandez Marina [ 02/Apr/15 ] |
|
Thanks for your report AliOli, we can reproduce the behavior you describe and are looking into it. This is a very specific corner case, but before we jump into a naive fix we need to investigate the issue in detail and make sure we don't create assorted other bugs. Stay tuned. Cheers, |