<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:36:35 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-14985] Merge stages in aggregation should be distributed beyond primary shard</title>
                <link>https://jira.mongodb.org/browse/SERVER-14985</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;&lt;a href=&quot;http://docs.mongodb.org/manual/core/aggregation-pipeline-sharded-collections/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://docs.mongodb.org/manual/core/aggregation-pipeline-sharded-collections/&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&quot;The second pipeline consists of the remaining pipeline stages and runs on the primary shard. The primary shard merges the cursors from the other shards and runs the second pipeline on these results. The primary shard forwards the final results to the mongos. In previous versions, the second pipeline would run on the mongos.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;ul&gt;
	&lt;li&gt;This prevents scaling out non-trivial aggregation pipeline queries&lt;/li&gt;
	&lt;li&gt;As a specific case, consider $redact: when you use it, &lt;b&gt;all&lt;/b&gt; of your queries become aggregation queries, so 100% of your reads have to flow through the primary shard, even if they would otherwise be targeted queries.&lt;/li&gt;
	&lt;li&gt;Minor, but related: note that selection of the primary shard is usually implicit, i.e. random from the user&apos;s point of view, and it cannot be changed afterwards.&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="153995">SERVER-14985</key>
            <summary>Merge stages in aggregation should be distributed beyond primary shard</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="henrik.ingo@mongodb.com">Henrik Ingo</reporter>
                        <labels>
                    </labels>
                <created>Thu, 21 Aug 2014 07:46:14 +0000</created>
                <updated>Fri, 12 Jun 2015 07:16:44 +0000</updated>
                            <resolved>Wed, 8 Apr 2015 19:51:54 +0000</resolved>
                                    <version>2.6.0</version>
                                                    <component>Aggregation Framework</component>
                                        <votes>1</votes>
                                    <watches>14</watches>
                                                                                                                <comments>
                            <comment id="876234" author="dan@10gen.com" created="Wed, 8 Apr 2015 19:51:54 +0000"  >&lt;p&gt;Closing as a duplicate of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-17737&quot; title=&quot;Support distributed merger for aggregation queries&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-17737&quot;&gt;SERVER-17737&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="699100" author="redbeard0531" created="Thu, 21 Aug 2014 23:16:12 +0000"  >&lt;p&gt;The merge stages in agg were moved from the mongos to the primary shard largely because of issues in mongos, unrelated to agg, that conflicted with our plans for 2.6. It was decided that fixing these issues was out of scope for 2.6.&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Mongos isn&apos;t allowed to write to disk (except when opting in to log files), so we could not use the external sorter (&quot;allowDiskUse&quot;) there.&lt;/li&gt;
	&lt;li&gt;Mongos doesn&apos;t really have any cursors of its own; it just forwards requests to the shards. In particular, cursors from a single shard are assumed to be from the primary shard for the database. It also has no mechanism to kill cursors, so if a client goes away before issuing a killCursor or exhausting the cursor, the cursor will be leaked. Since agg cursors can be much heavier than other cursors, this was deemed unacceptable.&lt;/li&gt;
	&lt;li&gt;$out needs to write to the primary shard, so the data might as well go there. This is because sharded output was out of scope for 2.6, and all unsharded collections must live on the primary shard (&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-939&quot; title=&quot;Ability to distribute collections in a single db&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-939&quot;&gt;SERVER-939&lt;/a&gt;). Note that map/reduce does have a sharded output, but there are a number of issues there (e.g., &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-12261&quot; title=&quot;Map Reduce with sharded output collection creates orphan documents&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-12261&quot;&gt;&lt;del&gt;SERVER-12261&lt;/del&gt;&lt;/a&gt;, &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-7926&quot; title=&quot;Map Reduce with sharded output can apply reduce on duplicate documents if a migration happened&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-7926&quot;&gt;&lt;del&gt;SERVER-7926&lt;/del&gt;&lt;/a&gt;, &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-14324&quot; title=&quot;MapReduce does not respect existing shard key on output:sharded&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-14324&quot;&gt;&lt;del&gt;SERVER-14324&lt;/del&gt;&lt;/a&gt;), and we decided not to emulate the map/reduce approach to sharded output.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Additionally, agg was the only case where we did &quot;heavy lifting&quot; inside of mongos, in terms of both CPU and memory usage (m/r does all of its real work on the shards, as 2.6 agg does). This conflicted with the design principle that mongos should be very lightweight, so that it can run on users&apos; app servers without contending too heavily for resources.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                            <outwardlinks description="depends on">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="191967">SERVER-17737</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="129095">DOCS-3064</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="210028">SERVER-18925</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10220">
                    <name>Tested</name>
                                            <outwardlinks description="tested by">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 21 Aug 2014 08:04:52 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        8 years, 45 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[<s><a href='https://jira.mongodb.org/browse/CAP-1225'>CAP-1225</a></s>]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>kevin.pulo@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            8 years, 45 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>dan@mongodb.com</customfieldvalue>
            <customfieldvalue>henrik.ingo@mongodb.com</customfieldvalue>
            <customfieldvalue>mathias@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlpfj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hs1ka7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>133704</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrj91b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>