<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:15:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-7694] external sort for find command</title>
                <link>https://jira.mongodb.org/browse/SERVER-7694</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Right now, when query results must be sorted before being returned, a top-N sort is performed in memory with a memory footprint cap enforced.  The goal for this ticket is to move to external sorting if too much memory is required; see the discussion in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-4716&quot; title=&quot;reexamine scan and order memory limit handling&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-4716&quot;&gt;&lt;del&gt;SERVER-4716&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</description>
                <environment></environment>
        <key id="56753">SERVER-7694</key>
            <summary>external sort for find command</summary>
                <type id="2" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14711&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="david.storch@mongodb.com">David Storch</assignee>
                                    <reporter username="aaron">Aaron Staple</reporter>
                        <labels>
                            <label>query_triage</label>
                    </labels>
                <created>Fri, 16 Nov 2012 20:55:15 +0000</created>
                <updated>Wed, 9 Oct 2019 17:26:15 +0000</updated>
                            <resolved>Wed, 2 Oct 2019 00:42:13 +0000</resolved>
                                                    <fixVersion>4.3.1</fixVersion>
                                    <component>Querying</component>
                                        <votes>12</votes>
                                    <watches>29</watches>
                                                                                                                <comments>
                            <comment id="2444961" author="david.storch" created="Wed, 2 Oct 2019 00:58:10 +0000"  >&lt;p&gt;As of development version 4.3.1 (which will evolve into the 4.4 GA release), users can opt into allowing disk use for sorting in a &lt;tt&gt;find&lt;/tt&gt; operation. This ability has existed for many releases for &lt;tt&gt;aggregate&lt;/tt&gt; operations: see &lt;a href=&quot;https://docs.mongodb.com/manual/reference/command/aggregate/#aggregate-data-using-external-sort&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this documentation page showing an example of an agg pipeline that requires an external sort&lt;/a&gt;. The same functionality is now available for &lt;tt&gt;find&lt;/tt&gt; operations. In the shell, the &lt;tt&gt;allowDiskUse:true&lt;/tt&gt; parameter to the &lt;tt&gt;find&lt;/tt&gt; command can be set using the following syntax:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;db.collection.find(&amp;lt;match&amp;gt;, &amp;lt;projection&amp;gt;).sort(&amp;lt;sort&amp;gt;).allowDiskUse();&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;
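&lt;p&gt;For scripts that issue the command directly rather than through the shell helper, the same option can be expressed as the &lt;tt&gt;allowDiskUse&lt;/tt&gt; field on the &lt;tt&gt;find&lt;/tt&gt; command document. The following is only a minimal sketch: the helper name &lt;tt&gt;buildFindWithDiskUse&lt;/tt&gt; and the collection and field names are hypothetical and used purely for illustration.&lt;/p&gt;

```javascript
// Hypothetical helper (not part of the shell or any driver): builds a find
// command document that opts into disk use for the sort.
function buildFindWithDiskUse(collectionName, filter, sort) {
  return {
    find: collectionName,   // name of the collection to query
    filter: filter,         // match predicate
    sort: sort,             // requested sort order
    allowDiskUse: true      // permit the sort to spill to disk
  };
}

// "events", "type" and "ts" are made-up names for this example.
var cmd = buildFindWithDiskUse("events", { type: "click" }, { ts: -1 });
```

&lt;p&gt;In the shell, a document like this could be passed to &lt;tt&gt;db.runCommand()&lt;/tt&gt;.&lt;/p&gt;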

&lt;p&gt;By default in versions &amp;gt;=4.3.1, mongod will begin spilling data to disk once the memory requirements exceed 100MB. This maximum memory consumption threshold can be configured at runtime or at startup using the &lt;tt&gt;internalQueryExecMaxBlockingSortBytes&lt;/tt&gt; setParameter (see &lt;a href=&quot;https://docs.mongodb.com/manual/reference/parameters/#synopsis&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this page&lt;/a&gt; for details on how to configure setParameters). If &lt;tt&gt;internalQueryExecMaxBlockingSortBytes&lt;/tt&gt; is exceeded when &lt;tt&gt;allowDiskUse&lt;/tt&gt; is true, data will be spilled to disk during execution of the sort; if &lt;tt&gt;allowDiskUse&lt;/tt&gt; is false, the query will fail.&lt;/p&gt;
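
&lt;p&gt;The rule described above can be modeled in a few lines. This is an illustrative sketch, not server code: the function name &lt;tt&gt;sortOutcome&lt;/tt&gt; and its return strings are hypothetical, and the sketch assumes that 100MB means 100 * 1024 * 1024 bytes.&lt;/p&gt;

```javascript
// Assumption: the 100MB default is interpreted as 100 * 1024 * 1024 bytes.
var DEFAULT_MAX_BLOCKING_SORT_BYTES = 100 * 1024 * 1024;

// Hypothetical model of the behavior described in this comment.
// sortBytes:    memory the sort would need
// allowDiskUse: whether the operation opted into disk use
// maxBytes:     optional override standing in for the setParameter value
function sortOutcome(sortBytes, allowDiskUse, maxBytes) {
  var limit = (maxBytes === undefined) ? DEFAULT_MAX_BLOCKING_SORT_BYTES : maxBytes;
  if (sortBytes > limit) {
    // Over the threshold: spill if permitted, otherwise the query fails.
    return allowDiskUse ? "spill-to-disk" : "fail";
  }
  // At or under the threshold the sort stays in memory.
  return "in-memory";
}
```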

&lt;p&gt;As part of this change, the execution statistics reported by the &lt;a href=&quot;https://docs.mongodb.com/manual/reference/command/explain/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;explain command&lt;/a&gt; for the SORT stage have changed slightly:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;tt&gt;memUsage&lt;/tt&gt; has been replaced by &lt;tt&gt;totalDataSizeSorted&lt;/tt&gt;, which gives the total number of bytes of data sorted. Since the data now might be either in memory or on disk, &lt;tt&gt;totalDataSizeSorted&lt;/tt&gt; is a better metric for understanding how much data was sorted than the memory usage. A future improvement could add an additional metric such as &lt;tt&gt;peakMemoryConsumption&lt;/tt&gt;, which would describe the maximum memory usage over the course of the execution of the sort.&lt;/li&gt;
	&lt;li&gt;A new boolean called &lt;tt&gt;usedDisk&lt;/tt&gt; is now available in order to indicate whether or not the SORT stage had to spill data to disk. The value of &lt;tt&gt;usedDisk&lt;/tt&gt; can only be &lt;tt&gt;true&lt;/tt&gt; if the application has set &lt;tt&gt;allowDiskUse:true&lt;/tt&gt; on either a &lt;tt&gt;find&lt;/tt&gt; or &lt;tt&gt;aggregate&lt;/tt&gt; operation.&lt;/li&gt;
&lt;/ul&gt;
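&lt;p&gt;As a rough sketch of how a script might read these two fields, the helper below walks an explain stage tree and returns the SORT stage. The helper name &lt;tt&gt;findSortStage&lt;/tt&gt; is hypothetical, and the sample object is a hand-written fragment for illustration, not real server output.&lt;/p&gt;

```javascript
// Hypothetical helper: walks a tree of explain execution stages and returns
// the first stage named "SORT", or null if there is none.
function findSortStage(stage) {
  if (!stage) { return null; }
  if (stage.stage === "SORT") { return stage; }
  if (stage.inputStage) { return findSortStage(stage.inputStage); }
  var children = stage.inputStages || [];
  for (var i = 0; i !== children.length; i++) {
    var found = findSortStage(children[i]);
    if (found) { return found; }
  }
  return null;
}

// Hand-written sample fragment; the values are made up.
var sample = {
  stage: "PROJECTION_SIMPLE",
  inputStage: {
    stage: "SORT",
    usedDisk: true,
    totalDataSizeSorted: 250000000,
    inputStage: { stage: "COLLSCAN" }
  }
};

var sortStage = findSortStage(sample);
// True only when a SORT stage exists and reports that it spilled to disk.
var spilled = (sortStage !== null) ? sortStage.usedDisk === true : false;
```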
</comment>
                            <comment id="2444902" author="xgen-internal-githook" created="Wed, 2 Oct 2019 00:34:45 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;username&apos;: &apos;dstorch&apos;, &apos;email&apos;: &apos;david.storch@mongodb.com&apos;, &apos;name&apos;: &apos;David Storch&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-7694&quot; title=&quot;external sort for find command&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-7694&quot;&gt;&lt;del&gt;SERVER-7694&lt;/del&gt;&lt;/a&gt; Enable allowing disk use for sorts in the find command.&lt;/p&gt;

&lt;p&gt;For find commands that request a sort, the query execution&lt;br/&gt;
engine can now spill data to disk if the memory requirements&lt;br/&gt;
would exceed &apos;internalQueryExecBlockingSortBytes&apos; and the&lt;br/&gt;
&apos;allowDiskUse&apos; parameter is set to true. By default, the&lt;br/&gt;
memory threshold is currently 100MB. This allows&lt;br/&gt;
applications which need to sort server-side to no longer be&lt;br/&gt;
subject to an arbitrary data size threshold.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/62f03390c9957eba4c250eb3893b873391add3d2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/62f03390c9957eba4c250eb3893b873391add3d2&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="496764" author="lthompson" created="Mon, 10 Feb 2014 22:43:05 +0000"  >&lt;p&gt;Thanks &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=dan%4010gen.com&quot; class=&quot;user-hover&quot; rel=&quot;dan@10gen.com&quot;&gt;dan@10gen.com&lt;/a&gt;, cool, that&apos;s a great step. &lt;/p&gt;

&lt;p&gt;I think that it doesn&apos;t address the core of the problem though.&lt;/p&gt;

&lt;p&gt;We (and probably lots of others) may have written code and deployed it to production which is basically a ticking time-bomb because of this limitation. &lt;/p&gt;

&lt;p&gt;Right now, our datasets are small enough that the sort is occurring in memory, but as the data grows we near the memory cap. &lt;/p&gt;

&lt;p&gt;One day a single record will be added to a collection, and the app will start crashing. &lt;/p&gt;

&lt;p&gt;A warning in the log that a query is using a large amount of resources for sorting and should have an index would be great, but the DB should keep responding even if the query slows when sort data sets no longer fit in memory.&lt;/p&gt;</comment>
                            <comment id="496026" author="dan@10gen.com" created="Mon, 10 Feb 2014 04:51:00 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=lthompson%40infomedia.com.au&quot; class=&quot;user-hover&quot; rel=&quot;lthompson@infomedia.com.au&quot;&gt;lthompson@infomedia.com.au&lt;/a&gt;, starting in the 2.6 MongoDB release, you should be able to use the aggregation framework for unindexed sorts. See &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9444&quot; title=&quot;Use new Sorter for Aggregation $sort and $group&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9444&quot;&gt;&lt;del&gt;SERVER-9444&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="491325" author="lthompson" created="Fri, 31 Jan 2014 02:48:33 +0000"  >&lt;p&gt;Thanks in advance for addressing this. &lt;/p&gt;

&lt;p&gt;I just had a commercial support ticket (CS-10484) to clear up why this behaviour was happening:&lt;br/&gt;
&quot;Is there a way to configure MongoDB to behave as we would have expected it to (basically to suck it up and sort slowly)?&quot;&lt;/p&gt;</comment>
                            <comment id="322554" author="jeff.yemin" created="Fri, 26 Apr 2013 19:50:42 +0000"  >&lt;p&gt;As &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-5374&quot; title=&quot;batchSize is a hard limit for an in memory sort&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-5374&quot;&gt;&lt;del&gt;SERVER-5374&lt;/del&gt;&lt;/a&gt; is marked as a duplicate of this, in order to satisfy that use case, the server needs to ensure that even when the number of results is below the memory footprint cap, queries with an explicit batch size return the same number of documents as those without.  Given the wire protocol, there is no way for the server to tell the difference between limit and batch size as specified in the driver API, so to make this work properly, the server has to return a cursor if the number of results &amp;gt; numberToReturn, regardless of whether it chooses to use an external sort or an in-memory sort.&lt;/p&gt;</comment>
                            <comment id="243605" author="eliot" created="Fri, 18 Jan 2013 16:06:38 +0000"  >&lt;p&gt;@aaron - yes, exactly.&lt;/p&gt;</comment>
                            <comment id="242833" author="aaron" created="Thu, 17 Jan 2013 19:48:26 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-5374&quot; title=&quot;batchSize is a hard limit for an in memory sort&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-5374&quot;&gt;&lt;del&gt;SERVER-5374&lt;/del&gt;&lt;/a&gt; (batchSize is a hard limit for an in memory sort) was closed as a duplicate of this ticket.  This suggests that we may want to change the behavior of scan and order queries to support getMore (retrieving results in batches instead of all in the initial query).&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                            <outwardlinks description="depends on">
                                        <issuelink>
            <issuekey id="42006">SERVER-6157</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="56498">SERVER-7676</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10320">
                    <name>Documented</name>
                                                                <inwardlinks description="is documented by">
                                        <issuelink>
            <issuekey id="948103">DOCS-13066</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="33594">SERVER-5374</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="72890">SERVER-9444</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="280165">SERVER-23768</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="943098">SERVER-43683</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="28645">SERVER-4716</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="22291">SERVER-3867</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[500A000000aPJFwIAO, 5002K00000d6x6VQAQ]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 18 Jan 2013 16:06:38 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 19 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[<s><a href='https://jira.mongodb.org/browse/SERVER-6157'>SERVER-6157</a></s>]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_17052" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Downstream Changes Summary</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>See public comment left immediately after the link to the git commit.</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16942"><![CDATA[Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-852</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>david.storch@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 19 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>aaron</customfieldvalue>
            <customfieldvalue>dan@mongodb.com</customfieldvalue>
            <customfieldvalue>david.storch@mongodb.com</customfieldvalue>
            <customfieldvalue>eliot</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>jeff.yemin@mongodb.com</customfieldvalue>
            <customfieldvalue>lthompson</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrnhbr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr6ssf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3894</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2886">Query 2019-10-07</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_17051" key="com.atlassian.jira.plugin.system.customfieldtypes:multicheckboxes">
                        <customfieldname>Teams Impacted</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16944"><![CDATA[Docs]]></customfieldvalue>
    <customfieldvalue key="16946"><![CDATA[Triage and Release]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hriy9r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>