<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:08:35 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-45364] Query Planner should estimate cost of each predicate</title>
                <link>https://jira.mongodb.org/browse/SERVER-45364</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;This can be used to choose a more efficient order of execution for predicates like $or and $elemMatch.&lt;/p&gt;

&lt;h1&gt;&lt;a name=&quot;OriginalTitle&quot;&gt;&lt;/a&gt;Original Title&lt;/h1&gt;
&lt;p&gt;The Query Parser reorders $elemMatch ahead of equality conditions when evaluating a $or set.&lt;/p&gt;

&lt;h1&gt;&lt;a name=&quot;OriginalDescription&quot;&gt;&lt;/a&gt;Original Description&lt;/h1&gt;
&lt;p&gt;I created a collection with 200 documents, all in this format:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;  { _id: ObjectId,&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;    aphaField: true,&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;    giantArray: [array of 20,000 documents like {ae: intValue}]&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;    zuluField: true }&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;
&lt;p&gt;The values for giantArray.ae are random values, but the first 19,999 are all &amp;gt; 17.&lt;br/&gt;
 The last one has value 11.&lt;/p&gt;

&lt;p&gt;Then I ran &lt;tt&gt;find(qry).count()&lt;/tt&gt; against the collection, where &lt;tt&gt;qry&lt;/tt&gt; was each of these filters:&lt;br/&gt;
 1. &lt;tt&gt;{aphaField: {$eq: true}}&lt;/tt&gt;&lt;br/&gt;
 2. &lt;tt&gt;{giantArray: {$elemMatch: {ae: {$eq: 11}}}}&lt;/tt&gt;&lt;br/&gt;
 3. &lt;tt&gt;{zuluField: {$eq: true}}&lt;/tt&gt;&lt;/p&gt;

&lt;p&gt;I ran each query 21 times. The first time was to warm the cache, and then I recorded how long the next 20 executions took. All queries used a COLLSCAN as the collection has no indexes beyond the default one on _id, and all scanned all 100 documents.&lt;/p&gt;

&lt;p&gt;The equality matches on the boolean fields alone (&lt;tt&gt;{aphaField: {$eq: true}}&lt;/tt&gt; and &lt;tt&gt;{zuluField: {$eq: true}}&lt;/tt&gt;) executed 20 times in 8 milliseconds. The &lt;tt&gt;$elemMatch&lt;/tt&gt; condition on the nested array field &lt;em&gt;&quot;giantArray.ae&quot;&lt;/em&gt; alone was slower, with 20 executions taking 4,616 milliseconds.&lt;/p&gt;

&lt;p&gt;This is not unexpected, as the singleton boolean fields are very fast to test, whereas all 20,000 array members had to be examined to find the one with value 11.&lt;/p&gt;


&lt;p&gt;Next I joined the equality tests on the boolean fields and the &lt;tt&gt;$elemMatch&lt;/tt&gt; condition within a &lt;tt&gt;$or&lt;/tt&gt;, both as &lt;tt&gt;{$or: [X, Y]}&lt;/tt&gt; and &lt;tt&gt;{$or: [Y, X]}&lt;/tt&gt;, and ran each of them 21 times as well:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;tt&gt;{$or: [{aphaField: {$eq: true}}, {zuluField: {$eq: true}}]}&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;{$or: [{aphaField: {$eq: true}}, {giantArray: {$elemMatch: {ae: {$eq: 11}}}}]}&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;tt&gt;{$or: [{giantArray: {$elemMatch: {ae: {$eq: 11}}}}, {aphaField: {$eq: true}}]}&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;{$or: [{giantArray: {$elemMatch: {ae: {$eq: 11}}}}, {zuluField: {$eq: true}}]}&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;tt&gt;{$or: [{zuluField: {$eq: true}}, {aphaField: {$eq: true}}]}&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;{$or: [{zuluField: {$eq: true}}, {giantArray: {$elemMatch: {ae: {$eq: 11}}}}]}&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;When X and Y were simple equality tests on boolean fields (i.e. no &lt;tt&gt;$elemMatch&lt;/tt&gt;), both &lt;tt&gt;{$or: [X, Y]}&lt;/tt&gt; and &lt;tt&gt;{$or: [Y, X]}&lt;/tt&gt; executed 20 times in 8 to 10 milliseconds. However, when either X or Y was the &lt;tt&gt;$elemMatch&lt;/tt&gt; condition, the Query Parser reordered THAT condition to be FIRST in the evaluation of the &lt;tt&gt;$or&lt;/tt&gt;. As a result, 20 executions of any query involving &lt;tt&gt;$elemMatch&lt;/tt&gt; took between 4,568 and 4,915 milliseconds.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-45308&quot; title=&quot;Alphabetical order of field names used in an $or clause drives evaluation order and thus affects performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-45308&quot;&gt;&lt;del&gt;SERVER-45308&lt;/del&gt;&lt;/a&gt; shows that the Query Parser reorders conditions alphabetically, but &lt;tt&gt;$elemMatch&lt;/tt&gt; is clearly a special case, perhaps handled in a different phase of parsing. Putting the &lt;tt&gt;$elemMatch&lt;/tt&gt; conditions first led to poorer performance because conditions are evaluated in that order too.&lt;/p&gt;

&lt;p&gt;Evaluating a condition against an array field will typically take longer than one on a simple field, because the array will generally have multiple values to be tested. If the condition on the boolean field had been evaluated first these queries would have completed in much less time.&lt;/p&gt;
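
&lt;p&gt;The effect of evaluation order can be sketched outside the server. The JavaScript below is a minimal in-memory model, not MongoDB code: the document, the two predicate functions, and &lt;tt&gt;evaluateOr&lt;/tt&gt; are hypothetical stand-ins, and the array values other than the final 11 are fixed at 18 rather than random. It assumes an &lt;tt&gt;$or&lt;/tt&gt; stops at the first branch that matches, and it counts how many value tests each ordering performs.&lt;/p&gt;

```javascript
// Hypothetical in-memory stand-ins for the document and predicates in this report.
// This does not use MongoDB; it only models short-circuit evaluation of an $or.
let comparisons = 0; // counts individual value tests performed

// One document shaped like the report's: two booleans and a large array whose
// last element is the only one with ae === 11 (other values fixed at 18 here).
const ARRAY_SIZE = 20000;
const doc = {
  aphaField: true,
  giantArray: Array.from({ length: ARRAY_SIZE }, (_, i) =>
    ({ ae: i === ARRAY_SIZE - 1 ? 11 : 18 })),
  zuluField: true,
};

// Stand-in for {aphaField: {$eq: true}}: a single comparison.
const aphaIsTrue = (d) => {
  comparisons += 1;
  return d.aphaField === true;
};

// Stand-in for {giantArray: {$elemMatch: {ae: {$eq: 11}}}}:
// tests array members one by one until one matches.
const arrayHasEleven = (d) =>
  d.giantArray.some((member) => {
    comparisons += 1;
    return member.ae === 11;
  });

// Evaluate an $or as a short-circuit scan over its branches, in the given order.
function evaluateOr(branches, d) {
  comparisons = 0;
  const matched = branches.some((branch) => branch(d));
  return { matched, comparisons };
}

const cheapFirst = evaluateOr([aphaIsTrue, arrayHasEleven], doc);
const expensiveFirst = evaluateOr([arrayHasEleven, aphaIsTrue], doc);

console.log(cheapFirst);     // { matched: true, comparisons: 1 }
console.log(expensiveFirst); // { matched: true, comparisons: 20000 }
```

&lt;p&gt;In this model both orderings return the same match, but the branch order decides whether one value or the whole array is tested.&lt;/p&gt;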

&lt;h1&gt;&lt;a name=&quot;Conclusions&quot;&gt;&lt;/a&gt;Conclusions&lt;/h1&gt;
&lt;p&gt;In &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-45308&quot; title=&quot;Alphabetical order of field names used in an $or clause drives evaluation order and thus affects performance&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-45308&quot;&gt;&lt;del&gt;SERVER-45308&lt;/del&gt;&lt;/a&gt; my conclusion was that no backend code change could help: because of MongoDB&apos;s flexible schema, the Query Parser cannot tell that &lt;tt&gt;{&quot;giantArray.ae&quot;: 11}&lt;/tt&gt; is a condition on an array field and not just a nested sub-field.&lt;/p&gt;

&lt;p&gt;In contrast, when the query filter employs &lt;tt&gt;$elemMatch&lt;/tt&gt; it should be obvious to the Query Parser that the field involved is an array of documents (or the end user would not have used that operator).&lt;/p&gt;
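
&lt;p&gt;Because the operator itself signals an array field, branches of an &lt;tt&gt;$or&lt;/tt&gt; could be ranked by an estimated cost per operator. The sketch below illustrates that idea in plain JavaScript. The weights in &lt;tt&gt;ASSUMED_COST&lt;/tt&gt; and the helpers &lt;tt&gt;operatorOf&lt;/tt&gt;, &lt;tt&gt;estimateCost&lt;/tt&gt;, and &lt;tt&gt;orderOrBranches&lt;/tt&gt; are hypothetical and do not correspond to any server code. It also assumes each branch is a single-field filter with a single operator.&lt;/p&gt;

```javascript
// Hypothetical cost model: these weights are illustrative assumptions, not
// values used by any MongoDB release.
const ASSUMED_COST = { $eq: 1, $elemMatch: 1000 };

// Return the operator of a single-field filter such as {a: {$eq: true}}.
function operatorOf(filter) {
  const [condition] = Object.values(filter);
  return Object.keys(condition)[0];
}

// Estimated cost of one $or branch; unknown operators get a middling default.
function estimateCost(filter) {
  const cost = ASSUMED_COST[operatorOf(filter)];
  return cost === undefined ? 10 : cost;
}

// Reorder the branches of an $or so the cheapest estimated branch runs first.
// Array.prototype.sort is stable, so equal-cost branches keep their order.
function orderOrBranches(query) {
  const branches = [...query.$or];
  branches.sort((a, b) => estimateCost(a) - estimateCost(b));
  return { $or: branches };
}

const reordered = orderOrBranches({
  $or: [
    { giantArray: { $elemMatch: { ae: { $eq: 11 } } } },
    { zuluField: { $eq: true } },
  ],
});

console.log(Object.keys(reordered.$or[0])[0]); // zuluField
console.log(Object.keys(reordered.$or[1])[0]); // giantArray
```

&lt;p&gt;A static weight per operator is the simplest possible estimate; a real planner would likely need data-dependent costs as well.&lt;/p&gt;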

&lt;p&gt;I suggest that the Query Parser be changed to order &lt;tt&gt;$elemMatch&lt;/tt&gt; conditions LAST within an &lt;tt&gt;$or&lt;/tt&gt; set. Overall I believe this will result in better performance.&lt;/p&gt;</description>
                <environment></environment>
        <key id="1076278">SERVER-45364</key>
            <summary>Query Planner should estimate cost of each predicate</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="backlog-query-optimization">Backlog - Query Optimization</assignee>
                                    <reporter username="william.byrne@mongodb.com">William Byrne III</reporter>
                        <labels>
                            <label>qopt-team</label>
                    </labels>
                <created>Mon, 6 Jan 2020 02:44:21 +0000</created>
                <updated>Tue, 6 Dec 2022 02:39:04 +0000</updated>
                                                                            <component>Querying</component>
                                        <votes>0</votes>
                                    <watches>13</watches>
                                                                                                                <comments>
                            <comment id="2709702" author="charlie.swanson" created="Tue, 7 Jan 2020 16:13:27 +0000"  >&lt;p&gt;Thanks for the request &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=william.byrne&quot; class=&quot;user-hover&quot; rel=&quot;william.byrne&quot;&gt;william.byrne&lt;/a&gt;. I&apos;ve generalized the title a bit since the query team believes this is more generally applicable.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=charlie.swanson&quot; class=&quot;user-hover&quot; rel=&quot;charlie.swanson&quot;&gt;charlie.swanson&lt;/a&gt; reminder to search for whether this request already exists.&lt;/p&gt;</comment>
                            <comment id="2704753" author="william.byrne" created="Mon, 6 Jan 2020 02:50:02 +0000"  >&lt;p&gt;The attached files are a reproduction script and its output (reformatted to be more compact).&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="1533409">SERVER-52619</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1066857">SERVER-45308</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="616200">SERVER-37530</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="241796" name="elemTest.js" size="2771" author="william.byrne@mongodb.com" created="Mon, 6 Jan 2020 02:48:54 +0000"/>
                            <attachment id="241795" name="elemTest.out.compact" size="1928" author="william.byrne@mongodb.com" created="Mon, 6 Jan 2020 02:48:54 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25126"><![CDATA[Query Optimization]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[5002K00000hwYhAQAU]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 7 Jan 2020 16:13:27 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 5 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>alexander.golin@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 5 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-query-optimization</customfieldvalue>
            <customfieldvalue>charlie.swanson@mongodb.com</customfieldvalue>
            <customfieldvalue>william.byrne@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hwevxj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr2hbz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hwei6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>