<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:19:16 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-9064] Count of Geo Querys of Millions of Documents Extremely Slow</title>
                <link>https://jira.mongodb.org/browse/SERVER-9064</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;I have a dataset of 5M records with geo locations, and I want to search it for points within a selected area.&lt;/p&gt;

&lt;p&gt;The search yields results quickly, but I also need a count of the points in the area.  If I select a city or an entire state on this data, the count query takes 45 seconds or more to execute.&lt;/p&gt;

&lt;p&gt;There is a similar problem if I want to sort the results of a find (NOT a count in this case), regardless of indexing.&lt;/p&gt;

&lt;p&gt;This thread has additional info&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://groups.google.com/forum/#!searchin/mongodb-user/geo$20performance/mongodb-user/UQNiXAHZHP0/2nu9unVFQ3QJ&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://groups.google.com/forum/#!searchin/mongodb-user/geo$20performance/mongodb-user/UQNiXAHZHP0/2nu9unVFQ3QJ&lt;/a&gt;&lt;/p&gt;</description>
                <environment>Linux (EC2)&lt;br/&gt;
OSX </environment>
        <key id="69316">SERVER-9064</key>
            <summary>Count of Geo Querys of Millions of Documents Extremely Slow</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="2" iconUrl="https://jira.mongodb.org/images/icons/priorities/critical.svg">Critical - P2</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="hari.khalsa@10gen.com">hari.khalsa@10gen.com</assignee>
                                    <reporter username="mattx">Matthew Cross</reporter>
                        <labels>
                    </labels>
                <created>Thu, 21 Mar 2013 20:06:31 +0000</created>
                <updated>Wed, 10 Dec 2014 23:11:06 +0000</updated>
                            <resolved>Thu, 2 May 2013 15:02:24 +0000</resolved>
                                    <version>2.2.0</version>
                    <version>2.4.0</version>
                                                    <component>Geo</component>
                                        <votes>1</votes>
                                    <watches>7</watches>
                                                                                                                <comments>
                            <comment id="331411" author="mattx" created="Wed, 8 May 2013 20:46:13 +0000"  >&lt;p&gt;The problem here is that neither the 2d nor the 2dsphere index lets me do counts within a polygon fast enough for my app to work.  The data behind these queries serves most of my webapp and, because of mongo&apos;s locking characteristics, a few large geo queries will bring my site down.  Even for moderate numbers of countable points within a polygon, query times run up to 10 seconds.  When you do enough of those queries it&apos;s game over for my app.  I&apos;ve tried enabling slaveOK and might try sharding, but in the end I&apos;m probably going to have to use a different store because of its 10x better performance.  I would really like to see this be a priority for you guys because this single factor is going to prevent me from using mongo at all, and I can&apos;t be alone on that count.&lt;/p&gt;

&lt;p&gt;I&apos;m open to other suggestions for how I might manage my data to facilitate these queries.  I can&apos;t pre-compute the counts because they are based on live weather patterns overlaid upon location data that can also change.  Postgres or ElasticSearch would be a nice workaround, but the nature of our queries is such that I can&apos;t use one of these alternatives without using it for everything.  I have even considered doing a follow-on MongoDB $in query with the millions of IDs that come out of Postgres or ES, but that just sounds insane.&lt;/p&gt;</comment>
                            <comment id="326641" author="hari.khalsa@10gen.com" created="Thu, 2 May 2013 15:02:12 +0000"  >&lt;p&gt;Hi!  Sorry for not getting back to this sooner.&lt;/p&gt;

&lt;p&gt;The 2dsphere index isn&apos;t designed for highly efficient counting.  Without an index format change or a redesign, I think it&apos;s not going to perform well for counting large numbers of documents.&lt;/p&gt;</comment>
                            <comment id="306303" author="mattx" created="Thu, 4 Apr 2013 22:11:40 +0000"  >&lt;p&gt;I will revisit my results from the user group thread but I was mostly referring to the much faster query using the $within-&amp;gt;$polygon with the standard &quot;2d&quot; index.&lt;/p&gt;</comment>
                            <comment id="305959" author="mattx" created="Thu, 4 Apr 2013 15:51:54 +0000"  >&lt;p&gt;Also note that I cannot hint the query to use the index.&lt;/p&gt;


&lt;p&gt;db.contact.find(/*...*/).hint({&quot;address.gisLocation&quot;:1});&lt;br/&gt;
error: { &quot;$err&quot; : &quot;bad hint&quot;, &quot;code&quot; : 10113 }&lt;/p&gt;

&lt;p&gt;db.contact.getIndexes()&lt;br/&gt;
[&lt;br/&gt;
	{&lt;br/&gt;
		&quot;v&quot; : 1,&lt;br/&gt;
		&quot;key&quot; : {&lt;br/&gt;
			&quot;_id&quot; : 1&lt;br/&gt;
		},&lt;br/&gt;
		&quot;ns&quot; : &quot;us-contact2.contact&quot;,&lt;br/&gt;
		&quot;name&quot; : &quot;_id_&quot;&lt;br/&gt;
	},&lt;br/&gt;
	{&lt;br/&gt;
		&quot;v&quot; : 1,&lt;br/&gt;
		&quot;key&quot; : {&lt;br/&gt;
			&quot;address.gisLocation&quot; : &quot;2dsphere&quot;&lt;br/&gt;
		},&lt;br/&gt;
		&quot;ns&quot; : &quot;us-contact2.contact&quot;,&lt;br/&gt;
		&quot;name&quot; : &quot;address.gisLocation_2dsphere&quot;&lt;br/&gt;
	}&lt;br/&gt;
]&lt;/p&gt;</comment>
                            <comment id="305958" author="hari.khalsa@10gen.com" created="Thu, 4 Apr 2013 15:51:41 +0000"  >&lt;p&gt;Hi.&lt;/p&gt;

&lt;p&gt;Here is the original query you reported:&lt;/p&gt;

&lt;p&gt;&amp;gt; db.contact2.find({geo_2: { $within:&lt;br/&gt;
...                              { $geometry: { &quot;type&quot;: &quot;Polygon&quot;,&lt;br/&gt;
...                                             &quot;coordinates&quot;: [[&lt;br/&gt;
...                                 [-88.352051,30.977609],&lt;br/&gt;
...                                 [-81.320801,30.694612],&lt;br/&gt;
...                                 [-79.431152,25.383735],&lt;br/&gt;
...                                 [-81.013184,24.487149],&lt;br/&gt;
...                                 [-82.990723,27.741885],&lt;br/&gt;
...                                 [-83.540039,29.401320],&lt;br/&gt;
...                                 [-88.308105,30.126124],&lt;br/&gt;
...                                 [-88.352051,30.977609]&lt;br/&gt;
...                                 ]]&lt;br/&gt;
...                              }&lt;br/&gt;
...                              }&lt;br/&gt;
...                              } }).explain();&lt;br/&gt;
{&lt;br/&gt;
	&quot;cursor&quot; : &quot;S2Cursor&quot;,&lt;br/&gt;
	&quot;isMultiKey&quot; : true,&lt;br/&gt;
	&quot;n&quot; : 4393932,&lt;br/&gt;
	&quot;nscannedObjects&quot; : 4393932,&lt;br/&gt;
	&quot;nscanned&quot; : 8807676,&lt;br/&gt;
	&quot;nscannedObjectsAllPlans&quot; : 4393932,&lt;br/&gt;
	&quot;nscannedAllPlans&quot; : 8807676,&lt;br/&gt;
	&quot;scanAndOrder&quot; : false,&lt;br/&gt;
	&quot;indexOnly&quot; : false,&lt;br/&gt;
	&quot;nYields&quot; : 81,&lt;br/&gt;
	&quot;nChunkSkips&quot; : 0,&lt;br/&gt;
	&quot;millis&quot; : 83783,&lt;br/&gt;
	&quot;indexBounds&quot; : {&lt;br/&gt;
	},&lt;br/&gt;
	&quot;nscanned&quot; : 8807676,&lt;br/&gt;
	&quot;matchTested&quot; : NumberLong(4413744),&lt;br/&gt;
	&quot;geoTested&quot; : NumberLong(4413744),&lt;br/&gt;
	&quot;cellsInCover&quot; : NumberLong(21),&lt;br/&gt;
	&quot;server&quot; : &quot;localhost:27017&quot;&lt;br/&gt;
}&lt;/p&gt;

&lt;p&gt;If you compare this to one of your recently-reported queries, you&apos;ll see the run time actually got smaller: millis was 83783, now is 66435.  So, I&apos;m not sure how your results are worse than before.  Can you elaborate?&lt;/p&gt;

&lt;p&gt;PS. Any time you see &apos;S2Cursor&apos; it&apos;s using the 2dsphere index.&lt;/p&gt;</comment>
                            <comment id="305953" author="mattx" created="Thu, 4 Apr 2013 15:44:51 +0000"  >&lt;p&gt;Tried again, first dropping all indexes and adding a single 2dsphere index on address.gisLocation.  Also changed my query to $geoWithin.  Results were marginally slower.&lt;/p&gt;

&lt;p&gt;real	1m18.714s&lt;/p&gt;

&lt;p&gt;{&lt;br/&gt;
	&quot;cursor&quot; : &quot;S2Cursor&quot;,&lt;br/&gt;
	&quot;isMultiKey&quot; : true,&lt;br/&gt;
	&quot;n&quot; : 4393932,&lt;br/&gt;
	&quot;nscannedObjects&quot; : 4393932,&lt;br/&gt;
	&quot;nscanned&quot; : 8807600,&lt;br/&gt;
	&quot;nscannedObjectsAllPlans&quot; : 4393932,&lt;br/&gt;
	&quot;nscannedAllPlans&quot; : 8807600,&lt;br/&gt;
	&quot;scanAndOrder&quot; : false,&lt;br/&gt;
	&quot;indexOnly&quot; : false,&lt;br/&gt;
	&quot;nYields&quot; : 4,&lt;br/&gt;
	&quot;nChunkSkips&quot; : 0,&lt;br/&gt;
	&quot;millis&quot; : 66435,&lt;br/&gt;
	&quot;indexBounds&quot; : {&lt;br/&gt;
	},&lt;br/&gt;
	&quot;nscanned&quot; : 8807600,&lt;br/&gt;
	&quot;matchTested&quot; : NumberLong(4413668),&lt;br/&gt;
	&quot;geoTested&quot; : NumberLong(4413668),&lt;br/&gt;
	&quot;cellsInCover&quot; : NumberLong(21),&lt;br/&gt;
	&quot;server&quot; : &quot;Matthews-MacBook-Pro.local:30017&quot;&lt;br/&gt;
}&lt;/p&gt;</comment>
                            <comment id="305947" author="mattx" created="Thu, 4 Apr 2013 15:37:36 +0000"  >&lt;p&gt;I downloaded a 2.5 nightly build which has this change listed in the release notes.&lt;/p&gt;

&lt;p&gt;My results are worse than before.  Should I be seeing an improvement?&lt;/p&gt;

&lt;p&gt;db.contact.find({&quot;address.gisLocation&quot;: { $within:&lt;br/&gt;
                             { $geometry: { &quot;type&quot;: &quot;Polygon&quot;,&lt;br/&gt;
                                            &quot;coordinates&quot;: [[&lt;br/&gt;
                                [-88.352051,30.977609],&lt;br/&gt;
                                [-81.320801,30.694612],&lt;br/&gt;
                                [-79.431152,25.383735],&lt;br/&gt;
                                [-81.013184,24.487149],&lt;br/&gt;
                                [-82.990723,27.741885],&lt;br/&gt;
                                [-83.540039,29.401320],&lt;br/&gt;
                                [-88.308105,30.126124],&lt;br/&gt;
                                [-88.352051,30.977609]&lt;br/&gt;
                                ]]&lt;br/&gt;
                             }&lt;br/&gt;
                             }&lt;br/&gt;
                             } });&lt;/p&gt;

&lt;p&gt;time to run: real	1m3.799s&lt;/p&gt;

&lt;p&gt;Here is the explain (note I don&apos;t know how to explain a count).  Looks like it didn&apos;t use the 2dsphere index at all.&lt;/p&gt;

&lt;p&gt;{&lt;br/&gt;
	&quot;cursor&quot; : &quot;S2Cursor&quot;,&lt;br/&gt;
	&quot;isMultiKey&quot; : true,&lt;br/&gt;
	&quot;n&quot; : 4393932,&lt;br/&gt;
	&quot;nscannedObjects&quot; : 4393932,&lt;br/&gt;
	&quot;nscanned&quot; : 8807602,&lt;br/&gt;
	&quot;nscannedObjectsAllPlans&quot; : 4393932,&lt;br/&gt;
	&quot;nscannedAllPlans&quot; : 8807602,&lt;br/&gt;
	&quot;scanAndOrder&quot; : false,&lt;br/&gt;
	&quot;indexOnly&quot; : false,&lt;br/&gt;
	&quot;nYields&quot; : 6,&lt;br/&gt;
	&quot;nChunkSkips&quot; : 0,&lt;br/&gt;
	&quot;millis&quot; : 66305,&lt;br/&gt;
	&quot;indexBounds&quot; : {&lt;br/&gt;
	},&lt;br/&gt;
	&quot;nscanned&quot; : 8807602,&lt;br/&gt;
	&quot;matchTested&quot; : NumberLong(4413670),&lt;br/&gt;
	&quot;geoTested&quot; : NumberLong(4413670),&lt;br/&gt;
	&quot;cellsInCover&quot; : NumberLong(21),&lt;br/&gt;
	&quot;server&quot; : &quot;Matthews-MacBook-Pro.local:30017&quot;&lt;br/&gt;
}&lt;/p&gt;


&lt;p&gt;db.contact.find({&quot;address.gisLocation&quot; : {&quot;$within&quot; :&lt;br/&gt;
                { &quot;$polygon&quot; : [&lt;br/&gt;
                                [-88.352051,30.977609],&lt;br/&gt;
                                [-81.320801,30.694612],&lt;br/&gt;
                                [-79.431152,25.383735],&lt;br/&gt;
                                [-81.013184,24.487149],&lt;br/&gt;
                                [-82.990723,27.741885],&lt;br/&gt;
                                [-83.540039,29.401320],&lt;br/&gt;
                                [-88.308105,30.126124]&lt;br/&gt;
                        ]&lt;br/&gt;
                }&lt;br/&gt;
          }&lt;br/&gt;
      });&lt;/p&gt;

&lt;p&gt;time to run: real	0m36.326s &lt;/p&gt;

&lt;p&gt;... and here is the explain for this query.&lt;/p&gt;

&lt;p&gt;{&lt;br/&gt;
	&quot;cursor&quot; : &quot;GeoBrowse-polygon&quot;,&lt;br/&gt;
	&quot;isMultiKey&quot; : false,&lt;br/&gt;
	&quot;n&quot; : 4392328,&lt;br/&gt;
	&quot;nscannedObjects&quot; : 4392328,&lt;br/&gt;
	&quot;nscanned&quot; : 4392328,&lt;br/&gt;
	&quot;nscannedObjectsAllPlans&quot; : 4392328,&lt;br/&gt;
	&quot;nscannedAllPlans&quot; : 4392328,&lt;br/&gt;
	&quot;scanAndOrder&quot; : false,&lt;br/&gt;
	&quot;indexOnly&quot; : false,&lt;br/&gt;
	&quot;nYields&quot; : 33638,&lt;br/&gt;
	&quot;nChunkSkips&quot; : 0,&lt;br/&gt;
	&quot;millis&quot; : 32122,&lt;br/&gt;
	&quot;indexBounds&quot; : {&lt;br/&gt;
		&quot;address.gisLocation&quot; : [ ]&lt;br/&gt;
	},&lt;br/&gt;
	&quot;lookedAt&quot; : NumberLong(4443480),&lt;br/&gt;
	&quot;matchesPerfd&quot; : NumberLong(0),&lt;br/&gt;
	&quot;objectsLoaded&quot; : NumberLong(4392328),&lt;br/&gt;
	&quot;pointsLoaded&quot; : NumberLong(0),&lt;br/&gt;
	&quot;pointsSavedForYield&quot; : NumberLong(0),&lt;br/&gt;
	&quot;pointsChangedOnYield&quot; : NumberLong(0),&lt;br/&gt;
	&quot;pointsRemovedOnYield&quot; : NumberLong(0),&lt;br/&gt;
	&quot;server&quot; : &quot;Matthews-MacBook-Pro.local:30017&quot;&lt;br/&gt;
}&lt;/p&gt;
</comment>
                            <comment id="305137" author="auto" created="Wed, 3 Apr 2013 15:45:34 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;date&apos;: u&apos;2013-04-01T16:03:14Z&apos;, u&apos;name&apos;: u&apos;Hari Khalsa&apos;, u&apos;email&apos;: u&apos;hkhalsa@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9064&quot; title=&quot;Count of Geo Querys of Millions of Documents Extremely Slow&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9064&quot;&gt;&lt;del&gt;SERVER-9064&lt;/del&gt;&lt;/a&gt; speed up 2dsphere point-in-poly containment&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/78c31beb18a47f431da12004c8f820136b0dc857&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/78c31beb18a47f431da12004c8f820136b0dc857&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="295782" author="mattx" created="Fri, 22 Mar 2013 13:45:21 +0000"  >&lt;p&gt;This information is also in the link above but just as a point of comparison, Postgresql with PostGIS returns results in approximately 3-5s and ElasticSearch in approximately 2-4s so with the right index this should be a fast operation.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 25 Mar 2013 13:53:25 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        10 years, 41 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            10 years, 41 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>auto</customfieldvalue>
            <customfieldvalue>hari.khalsa@10gen.com</customfieldvalue>
            <customfieldvalue>mattx</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrn0jb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrmuxz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>47103</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht0nsv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>