<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:15:44 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-27637] Merging phase of a distributed $sample should recognize input streams are already sorted</title>
                <link>https://jira.mongodb.org/browse/SERVER-27637</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;When issuing an aggregation starting with a &lt;tt&gt;$sample&lt;/tt&gt; of size N on a sharded collection, we split the &lt;tt&gt;$sample&lt;/tt&gt; into two parts: First to gather a sample of size N on each shard, and second to merge the samples together for a final sample which potentially includes documents from each shard. The first part can be achieved by either (1) doing a full sort of all documents based on injected random values, then taking the top N or (2) doing repeated random cursor walks over an index until we get N unique documents. During approach (2) we inject random values into the documents after-the-fact in such a way that the output documents are still in decreasing order of random value.&lt;/p&gt;

&lt;p&gt;In either case, the documents output from each shard will be in order by a random metadata field, and just need to be merged in a &quot;merge sorted streams&quot; style. This makes the merging half of the &lt;tt&gt;$sample&lt;/tt&gt; stage equivalent to the merging half of a &lt;tt&gt;$sort&lt;/tt&gt; stage, so when splitting a &lt;tt&gt;$sample&lt;/tt&gt; stage we generate a {&lt;tt&gt;sample: {size: N&lt;/tt&gt;}} to run on all the shards, and a {&lt;tt&gt;$sort: {sortKey: {$computed0: {meta: &quot;randVal&quot;&lt;/tt&gt;}}}} to run on the merging shard. This &lt;tt&gt;$sort&lt;/tt&gt; stage should also include the &quot;mergingPresorted&quot; option, so that it can take advantage of the fact that the inputs are already sorted and avoid the need to spill to disk.&lt;/p&gt;

&lt;p&gt;The net impact of this is that today, a &lt;tt&gt;$sample&lt;/tt&gt; issued against a sharded collection can error with a message indicating that the user needs to pass &apos;allowDiskUse: true&apos; to perform the sort on the merging shard, even though no disk use is required. It should also speed up all distributed {{$sample}}s if we take advantage of the already sorted streams.&lt;/p&gt;</description>
                <environment></environment>
        <key id="344468">SERVER-27637</key>
            <summary>Merging phase of a distributed $sample should recognize input streams are already sorted</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="bernard.gorman@mongodb.com">Bernard Gorman</assignee>
                                    <reporter username="charlie.swanson@mongodb.com">Charlie Swanson</reporter>
                        <labels>
                    </labels>
                <created>Wed, 11 Jan 2017 17:02:15 +0000</created>
                <updated>Mon, 14 Aug 2017 15:30:27 +0000</updated>
                            <resolved>Mon, 14 Aug 2017 15:30:27 +0000</resolved>
                                                                    <component>Aggregation Framework</component>
                                        <votes>1</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="1643368" author="bernard.gorman" created="Tue, 8 Aug 2017 18:50:40 +0000"  >&lt;p&gt;This issue was resolved &lt;a href=&quot;https://github.com/mongodb/mongo/commit/9062f71621d053e32beec1061f708e3fd08b0158&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;with this commit&lt;/a&gt; as part of the work on &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-22760&quot; title=&quot;Sharded aggregation pipelines which involve taking a simple union should merge on mongos&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-22760&quot;&gt;&lt;del&gt;SERVER-22760&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="266763">SERVER-22760</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 27 Jul 2017 15:22:58 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 27 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>david.storch@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 27 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>bernard.gorman@mongodb.com</customfieldvalue>
            <customfieldvalue>charlie.swanson@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht0rxj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hra1n3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="1823">Query 2017-08-21</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hs4jmf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>