<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 08:06:51 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[DOCS-13065] Investigate changes in SERVER-9507: Optimize $sort+$group+$first pipeline to avoid full index scan</title>
                <link>https://jira.mongodb.org/browse/DOCS-13065</link>
                <project id="10380" key="DOCS">Documentation</project>
                    <description>&lt;h2&gt;&lt;a name=&quot;Description&quot;&gt;&lt;/a&gt;Description&lt;/h2&gt;

    &lt;div class=&quot;panel&quot; style=&quot;background-color: #c2d2c2;border-width: 1px;&quot;&gt;&lt;div class=&quot;panelHeader&quot; style=&quot;border-bottom-width: 1px;background-color: #239eb0;&quot;&gt;&lt;b&gt;Downstream Change Summary&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;panelContent&quot; style=&quot;background-color: #c2d2c2;&quot;&gt;
&lt;p&gt;    The section titled &quot;Pipeline Operators and Indexes&quot; from &lt;a href=&quot;https://docs.mongodb.com/manual/core/aggregation-pipeline/#pipeline-operators-and-indexes&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/core/aggregation-pipeline/#pipeline-operators-and-indexes&lt;/a&gt; should be updated due to this change. It currently lists $match, $sort, and $geoNear as eligible to result in index use. However, this list is not exhaustive. Due to this change, a pipeline with a $group at the beginning can use an index, even if there is no $sort stage.&lt;/p&gt;

&lt;p&gt;In addition to documenting this new behavior, I suggest that we change the language of this page so that it does not claim to be exhaustive. That is, it should not say &quot;Stages X, Y, and Z can result in index use.&quot; Instead, it should say something like &quot;MongoDB&apos;s query planner analyzes an aggregation pipeline in order to determine whether indexes can be used to accelerate the operation. For example, an index can be used for filtering if a $match is at the beginning of the pipeline, or can be moved to the beginning of the pipeline by the optimizer. Similarly, a $sort at the beginning of the pipeline can be computed by scanning an index in order. As a final example, $group stages which obtain the distinct values of a field can use an index for the distinct operation if they occur at the beginning of the pipeline.&quot;&lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;h2&gt;&lt;a name=&quot;DescriptionofLinkedTicket&quot;&gt;&lt;/a&gt;Description of Linked Ticket&lt;/h2&gt;
&lt;p&gt;    This is an analogue to &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2094&quot; title=&quot;distinct cheat with indexes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2094&quot;&gt;&lt;del&gt;SERVER-2094&lt;/del&gt;&lt;/a&gt; (&quot;distinct cheat with indexes&quot;), but for the aggregation framework.&lt;/p&gt;

&lt;p&gt;This performance improvement is to allow $group operators like $first to be able to take advantage of the fact that the input to the pipeline is sorted, and thus reduce the number of index entries scanned by &quot;skipping&quot; processing of large portions of the pipeline.&lt;/p&gt;

&lt;p&gt;For example, suppose a user has a collection with an index {&lt;tt&gt;x:1,y:1&lt;/tt&gt;}, and that &lt;tt&gt;x&lt;/tt&gt; has low cardinality.  Consider the following pipeline:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;db.foo.aggregate({$sort:{x:1,y:1}},{$group:{_id:{x:&lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;$x&quot;&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;},y:{$first:&lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;$y&quot;&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;}}})&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;Currently, the above pipeline will perform a full scan of the index.  After this optimization, the above pipeline will only have to scan on the order of &lt;tt&gt;|x|&lt;/tt&gt; index entries, which is much smaller than the size of the index.&lt;/p&gt;

&lt;p&gt;This ticket is filed as a result of discussion in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9272&quot; title=&quot;Querying latest document based on a set of field&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9272&quot;&gt;&lt;del&gt;SERVER-9272&lt;/del&gt;&lt;/a&gt; (full use case available there).&lt;/p&gt;


&lt;h2&gt;&lt;a name=&quot;Scopeofchanges&quot;&gt;&lt;/a&gt;Scope of changes&lt;/h2&gt;

&lt;h2&gt;&lt;a name=&quot;ImpacttoOtherDocs&quot;&gt;&lt;/a&gt;Impact to Other Docs&lt;/h2&gt;

&lt;h2&gt;&lt;a name=&quot;MVP%28WorkandDate%29&quot;&gt;&lt;/a&gt;MVP (Work and Date)&lt;/h2&gt;

&lt;h2&gt;&lt;a name=&quot;Resources%28ScopeorDesignDocs%2CInvision%2Cetc.%29&quot;&gt;&lt;/a&gt;Resources (Scope or Design Docs, Invision, etc.)&lt;/h2&gt;
</description>
                <environment></environment>
        <key id="947125">DOCS-13065</key>
            <summary>Investigate changes in SERVER-9507: Optimize $sort+$group+$first pipeline to avoid full index scan</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="jeffrey.allen@mongodb.com">Jeffrey Allen</assignee>
                                    <reporter username="backlog-server-pm">Backlog - Core Eng Program Management Team</reporter>
                        <labels>
                    </labels>
                <created>Tue, 1 Oct 2019 14:49:36 +0000</created>
                <updated>Mon, 13 Nov 2023 18:15:42 +0000</updated>
                            <resolved>Mon, 13 Jan 2020 22:16:10 +0000</resolved>
                                                    <fixVersion>4.1.4</fixVersion>
                    <fixVersion>Server_Docs_20231030</fixVersion>
                    <fixVersion>Server_Docs_20231106</fixVersion>
                    <fixVersion>Server_Docs_20231105</fixVersion>
                    <fixVersion>Server_Docs_20231113</fixVersion>
                                    <component>manual</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                    <issuelinks>
                            <issuelinktype id="10320">
                    <name>Documented</name>
                                            <outwardlinks description="documents">
                                        <issuelink>
            <issuekey id="73561">SERVER-9507</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="2127296">SERVER-69359</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 1 Oct 2019 14:57:40 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 19 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>DOCS-11762</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>emet.ozar@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 19 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-pm</customfieldvalue>
            <customfieldvalue>jeffrey.allen@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvtntb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hvi7o7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="1324">KANBAN BUCKET</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10555" key="com.atlassian.jira.plugin.system.customfieldtypes:float">
                        <customfieldname>Story Points</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvta2n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                </customfields>
    </item>
</channel>
</rss>