[SERVER-57938] Skip polygon validation for stored GeoJSON when query has $geoIntersect and a 2dsphere index Created: 22/Jun/21  Updated: 29/Oct/23  Resolved: 14/Sep/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.4.10, 5.1.0-rc0, 5.0.10

Type: Improvement Priority: Major - P3
Reporter: Eric Cox (Inactive) Assignee: Eric Cox (Inactive)
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2021-06-21 at 10.57.24 AM.png    
Issue Links:
Backports
Problem/Incident
Related
is related to SERVER-20843 $geoIntersects performs poorly on pol... Backlog
is related to SERVER-15204 Skip validation for stored geometry i... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0, v4.4
Sprint: QE 2021-09-20
Participants:

 Description   

A CPU profile collected to investigate a slow $geoIntersects query, run against GeoJSON documents containing polygons with thousands of edges, showed that we spend about 87.5% of the CPU time validating that each polygon is closed and that its inner loops represent "holes" (see attached flame chart). Most of that time is spent in S2Loop::Contains().

The goal of this ticket is to implement the skipValidation flag, which bypasses geometry validation when we execute $geoIntersects queries and there is a 2dsphere index on the stored geometries. Work was done under SERVER-15204 to skip validation, but it didn't cover this case. Why can we do this? We already call GeometryContainer::parseFromStorage when getting S2 index keys, so the validation has already been performed when the index keys were generated.
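
The idea can be pictured with a small model in shell-compatible JavaScript. This is only a sketch: the server logic is C++, and the names isClosedRing and parseStoredPolygon are hypothetical, not server APIs. The check shown covers ring closure only; the hole-containment check mentioned above is omitted.

```javascript
// Illustrative model only. The real parsing and validation happen in the
// server's C++ geometry code; every name below is hypothetical.

// A GeoJSON linear ring is closed when it has at least four positions and
// its first and last positions are identical.
function isClosedRing(ring) {
  if (ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return first[0] === last[0] && first[1] === last[1];
}

// Parse a stored polygon. When options.skipValidation is set, the per-ring
// checks are bypassed, on the assumption that they already ran when the
// 2dsphere index keys were generated for this document.
function parseStoredPolygon(geoJson, options) {
  const skipValidation = Boolean(options && options.skipValidation);
  if (geoJson.type !== "Polygon") {
    throw new Error("expected a GeoJSON Polygon");
  }
  if (!skipValidation) {
    geoJson.coordinates.forEach(function (ring, i) {
      if (!isClosedRing(ring)) {
        throw new Error("ring " + i + " is not closed");
      }
    });
  }
  return { rings: geoJson.coordinates, validated: !skipValidation };
}
```

In this model the read path would pass { skipValidation: true } only for geometries covered by a 2dsphere index, while the index-key path would keep validating.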

The performance issue can be reproduced by downloading the dump.tgz attachment linked here, running mongorestore against a local mongod, and then running the following commands in the shell.

use BTGST
db.Germany_bbox.find({'bbox.0': {$lt: 6.7767088},'bbox.1': {$lt: 51.2217392}, 'bbox.2': {$gt: 6.7767088}, 'bbox.3': {$gt: 51.2217392}, 'geometry': {'$geoIntersects': {'$geometry': {'type': 'Point','coordinates': [6.7767088,51.2217392]}}}}, {_id:0, 'properties.TARIFF_LAYER_ID':1 }).explain("executionStats")
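
To try the same query shape at other coordinates, the filter can be built by a small helper. This is a sketch; buildPointIntersectsFilter is a hypothetical name, and the field names (bbox, geometry) are taken from the query above.

```javascript
// Build the same filter as the repro query for an arbitrary point.
// The bbox.* predicates narrow the candidates to documents whose bounding
// box contains the point; $geoIntersects then tests the actual geometry.
function buildPointIntersectsFilter(lon, lat) {
  return {
    "bbox.0": { $lt: lon },
    "bbox.1": { $lt: lat },
    "bbox.2": { $gt: lon },
    "bbox.3": { $gt: lat },
    geometry: {
      $geoIntersects: {
        $geometry: { type: "Point", coordinates: [lon, lat] },
      },
    },
  };
}

// Shell usage (assumes the restored BTGST database is selected):
//   db.Germany_bbox.find(buildPointIntersectsFilter(6.7767088, 51.2217392),
//                        {_id: 0, 'properties.TARIFF_LAYER_ID': 1})
//     .explain("executionStats")
```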

Preliminary hacking showed that the latency of the FETCH stage went from ~4s to ~2s if we skipped validation during reads.
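
One way to compare FETCH latency before and after a change is to read it from the explain output rather than by eye. The helper below is a sketch assuming the classic, unsharded explain format, where stages nest under executionStats.executionStages via inputStage or inputStages and each stage reports executionTimeMillisEstimate. The function names are hypothetical.

```javascript
// Collect every stage with the given name from an explain stage tree.
function findStages(stage, name, found) {
  found = found || [];
  if (!stage) return found;
  if (stage.stage === name) found.push(stage);
  // Single-child stages use inputStage; multi-child stages use inputStages.
  if (stage.inputStage) findStages(stage.inputStage, name, found);
  (stage.inputStages || []).forEach(function (child) {
    findStages(child, name, found);
  });
  return found;
}

// Return the estimated execution time, in milliseconds, of each FETCH stage.
function fetchStageMillis(explainOutput) {
  const root = explainOutput.executionStats.executionStages;
  return findStages(root, "FETCH").map(function (s) {
    return s.executionTimeMillisEstimate;
  });
}

// Shell usage: pass the document returned by .explain("executionStats"), e.g.
//   fetchStageMillis(db.Germany_bbox.find(filter).explain("executionStats"))
```

Because executionTimeMillisEstimate is an estimate, it is best treated as a rough comparison between runs rather than an exact measurement.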



 Comments   
Comment by Githook User [ 10/Jul/22 ]

Author:

{'name': 'Eric Cox', 'email': 'eric.cox@mongodb.com', 'username': 'ericox'}

Message: SERVER-57938 Skip validation for stored geometry if a 2dsphere index exists

cherry-picks 85cf0f4bdb5763514df90653a677cf8fa0100305
Branch: v5.0
https://github.com/mongodb/mongo/commit/39ef61c583b47d764872acc9f3d4c4c168377b5b

Comment by Githook User [ 17/Sep/21 ]

Author:

{'name': 'Eric Cox', 'email': 'eric.cox@mongodb.com', 'username': 'ericox'}

Message: SERVER-57938 Skip validation for stored geometry if a 2dsphere index exists

(cherry picked from commit 85cf0f4bdb5763514df90653a677cf8fa0100305)
Branch: v4.4
https://github.com/mongodb/mongo/commit/84054e82ceeab48882e3ce764b1b9e0d4a5287d6

Comment by Githook User [ 14/Sep/21 ]

Author:

{'name': 'Eric Cox', 'email': 'eric.cox@mongodb.com', 'username': 'ericox'}

Message: SERVER-57938 Skip validation for stored geometry if a 2dsphere index exists
Branch: master
https://github.com/mongodb/mongo/commit/85cf0f4bdb5763514df90653a677cf8fa0100305

Comment by Ethan Zhang (Inactive) [ 26/Jul/21 ]

Very cool, thanks charlie.swanson

CC anand.sanghani

Comment by Charlie Swanson [ 26/Jul/21 ]

ethan.zhang this was chosen as a quick win for the upcoming quarter. I can't give you any better timeline than that, but we hope to give it a shot in the next ~3 months as we have some spare cycles between larger projects. 

Comment by Ethan Zhang (Inactive) [ 26/Jul/21 ]

Hi charlie.swanson jacob.evans james.wahlin, I am curious what QO has decided to do for this ticket.

Comment by David Storch [ 29/Jun/21 ]

I do agree that the work is primarily QO-related, although QE could also do the work if it is more convenient from a scheduling standpoint.

Comment by Ethan Zhang (Inactive) [ 28/Jun/21 ]

This might involve a lot of work in QO. Would it be better as a QO SERVER ticket?
