<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 02:59:40 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-2340] MapReduce finalize should be able to throw result row away</title>
                <link>https://jira.mongodb.org/browse/SERVER-2340</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;I have a use case where I would like to throw away a result in the finalize phase within map reduce. AFAIK, finalize can currently only modify the result object, not remove it completely.&lt;/p&gt;

&lt;p&gt;My use case consists of a map reduce where I first emit &lt;code&gt;{ count : 1 }&lt;/code&gt; in the map phase and then sum the counts together in the reduce phase. Then I would like to discard all results whose count is less than some value and return only those whose count is greater than my requirement. In practice, finalize would discard 99.99% of my results, so it would be much more efficient to do it there instead of manually iterating over or querying the temporary result collection.&lt;/p&gt;

&lt;p&gt;I propose that returning null in the finalize phase would discard the result. Currently all examples of the finalize function return the result object, so implementing this would not change the current behavior.&lt;/p&gt;</description>
                <environment></environment>
        <key id="14187">SERVER-2340</key>
            <summary>MapReduce finalize should be able to throw result row away</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="5" iconUrl="https://jira.mongodb.org/images/icons/priorities/trivial.svg">Trivial - P5</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="backlog-query-optimization">Backlog - Query Optimization</assignee>
                                    <reporter username="garo">Juho M&#228;kinen</reporter>
                        <labels>
                    </labels>
                <created>Mon, 10 Jan 2011 10:19:02 +0000</created>
                <updated>Tue, 6 Dec 2022 05:46:14 +0000</updated>
                            <resolved>Fri, 4 Feb 2022 15:09:58 +0000</resolved>
                                                                    <component>MapReduce</component>
                                        <votes>13</votes>
                                    <watches>16</watches>
                                                                                                                <comments>
                            <comment id="4335990" author="esha.bhargava" created="Fri, 4 Feb 2022 15:08:22 +0000"  >&lt;p&gt;Closing these tickets as part of the deprecation of mapReduce.&lt;/p&gt;</comment>
                            <comment id="2400874" author="asya" created="Fri, 30 Aug 2019 06:24:21 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=kostano%40yahoo.com&quot; class=&quot;user-hover&quot; rel=&quot;kostano@yahoo.com&quot;&gt;kostano@yahoo.com&lt;/a&gt; I recommend you reconsider using map-reduce for your use case and see if it&apos;s possible to do it in the aggregation pipeline. We are moving away from supporting map-reduce and don&apos;t plan to add new functionality to it, in favor of enhancing the aggregation pipeline so that it can do everything map-reduce can currently do (and much more).&lt;/p&gt;

&lt;p&gt;Is there a reason you cannot use aggregation to perform the workflow you described?&lt;/p&gt;
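
&lt;p&gt;As a rough illustration only: for the count-threshold use case in the issue description, an aggregation pipeline could look something like the sketch below. The field name &lt;code&gt;key&lt;/code&gt; and the collection name &lt;code&gt;events&lt;/code&gt; are assumptions made for the example, and the threshold of 100 simply mirrors the finalize example posted on this ticket.&lt;/p&gt;

```javascript
// Sketch: group documents by a hypothetical "key" field, count them,
// and keep only the groups whose count exceeds the threshold.
// The $match stage plays the role of "return null in finalize".
const threshold = 100;
const pipeline = [
  { $group: { _id: "$key", count: { $sum: 1 } } },
  { $match: { count: { $gt: threshold } } },
];
// In the mongo shell this could be run as, for example:
//   db.events.aggregate(pipeline)   // "events" is a hypothetical collection
```

&lt;p&gt;Because the filtering happens in a pipeline stage, the discarded groups are never written to a result collection.&lt;/p&gt;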

</comment>
                            <comment id="2400376" author="kostano@yahoo.com" created="Thu, 29 Aug 2019 19:40:47 +0000"  >&lt;p&gt;Hi everybody, I also need this functionality.&lt;/p&gt;

&lt;p&gt;This is my business case.&lt;/p&gt;

&lt;p&gt;I have a small collection A (~0.5 million documents) that I need to populate with information stored in a &quot;lookup&quot; collection B, which has 80 million documents.&lt;/p&gt;

&lt;p&gt;The challenge is the complex mapping logic needed to match a document in A with a document in B.&lt;/p&gt;

&lt;p&gt;There is no way I can create an index on B.&lt;/p&gt;

&lt;p&gt;My plan was to use map reduce: in the map function for collection B, create a key (following the complex logic; more than one document can be emitted, analogous to flatMap in Java streams) with the required information in the value, and output to collection A so that matched keys would be passed to the reduce function.&lt;/p&gt;

&lt;p&gt;In the reduce function, I would add a specific tag to the value so I could identify documents that were processed by the reduce function.&lt;/p&gt;

&lt;p&gt;In the finalize function, I would return null so that documents that are not required would be discarded.&lt;/p&gt;

&lt;p&gt;Another possibility would be to add a filter condition for the map reduce result, similar to the approach used for filtering the input documents with the query option.&lt;/p&gt;</comment>
                            <comment id="1297032" author="asya" created="Thu, 16 Jun 2016 19:16:11 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=gargsatish0&quot; class=&quot;user-hover&quot; rel=&quot;gargsatish0&quot;&gt;gargsatish0&lt;/a&gt; the aggregation pipeline can handle very complex conditionals (though not all, obviously). If there are missing functions, please be sure to add a new server ticket requesting the new functionality for aggregation so that it can be the solution for your use case.&lt;/p&gt;
</comment>
                            <comment id="1291473" author="gargsatish0" created="Sat, 11 Jun 2016 22:20:23 +0000"  >&lt;p&gt;Hi,&lt;br/&gt;
Any update on this?&lt;br/&gt;
Is it now possible to remove a key in the map reduce finalize function, or is there any workaround that I could try?&lt;/p&gt;

&lt;p&gt;Also, I cannot use the aggregation pipeline because I need complex conditional projections,&lt;br/&gt;
which are currently not supported.&lt;/p&gt;

&lt;p&gt;And I can&apos;t do this on the application side, as both the source and the resulting datasets are huge.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Satish&lt;/p&gt;</comment>
                            <comment id="923287" author="maziyar" created="Tue, 26 May 2015 14:03:51 +0000"  >&lt;p&gt;I would also appreciate it if it were possible to remove the key from the results at the finalize stage. One of the best things about finalize is that it lets us check whether the final results meet our conditions. Millions of documents would be reduced to a couple of thousand, which is a huge improvement for inserts and for further operations on the result_collection.&lt;/p&gt;

&lt;p&gt;If it were possible to use aggregation, I would definitely have used it by now, as I do for so many other things, since it&apos;s faster, easier and more convenient. But it is not as flexible as MR!&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Maziyar&lt;/p&gt;</comment>
                            <comment id="754539" author="asya" created="Mon, 3 Nov 2014 15:34:58 +0000"  >&lt;p&gt;This is easily done if your MR can be done as an aggregation pipeline (as the original example can be)...&lt;/p&gt;</comment>
                            <comment id="754379" author="jalava" created="Mon, 3 Nov 2014 10:06:25 +0000"  >&lt;p&gt;I need this feature in the following use case:&lt;/p&gt;

&lt;p&gt;Due to performance requirements we are forced to use &lt;code&gt;{ inline: 1 }&lt;/code&gt; as output.&lt;/p&gt;

&lt;p&gt;We are reducing a huge data set to a data set that is over 16 megabytes (which is the BSON size limit), but 90% of the data could be discarded in finalize by just returning null.&lt;/p&gt;

&lt;p&gt;Add an option &lt;code&gt;{ discardNullOnFinalize: 1 }&lt;/code&gt; if you are concerned that users might want to return a null object with the key still in the result set.&lt;/p&gt;</comment>
                            <comment id="311171" author="lalitagarw" created="Thu, 11 Apr 2013 15:00:23 +0000"  >&lt;p&gt;I am also facing the same issue. Any updates on this improvement?&lt;/p&gt;</comment>
                            <comment id="60900" author="antoine" created="Mon, 17 Oct 2011 18:17:28 +0000"  >&lt;p&gt;We need to agree on the feature first.&lt;/p&gt;</comment>
                            <comment id="60899" author="auto" created="Mon, 17 Oct 2011 18:17:15 +0000"  >&lt;p&gt;Author: {u&apos;login&apos;: u&apos;agirbal&apos;, u&apos;name&apos;: u&apos;agirbal&apos;, u&apos;email&apos;: u&apos;antoine@10gen.com&apos;}&lt;/p&gt;
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2340&quot; title=&quot;MapReduce finalize should be able to throw result row away&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2340&quot;&gt;&lt;del&gt;SERVER-2340&lt;/del&gt;&lt;/a&gt;: undoing until we decide on feature&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/2a8522c03f0259bbc9134a93b88bb00f5f2c58ce&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/2a8522c03f0259bbc9134a93b88bb00f5f2c58ce&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="60898" author="auto" created="Mon, 17 Oct 2011 18:17:14 +0000"  >&lt;p&gt;Author: {u&apos;login&apos;: u&apos;agirbal&apos;, u&apos;name&apos;: u&apos;agirbal&apos;, u&apos;email&apos;: u&apos;antoine@10gen.com&apos;}&lt;/p&gt;
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2340&quot; title=&quot;MapReduce finalize should be able to throw result row away&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2340&quot;&gt;&lt;del&gt;SERVER-2340&lt;/del&gt;&lt;/a&gt;: MapReduce finalize should be able to throw result row away&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/20a1c7ae3294f9e475adf2a0212726d2dc21b69b&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/20a1c7ae3294f9e475adf2a0212726d2dc21b69b&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="60896" author="antoine" created="Mon, 17 Oct 2011 17:55:09 +0000"  >&lt;p&gt;To use this feature, return null in finalize.&lt;br/&gt;
M/R will omit that key from the final result.&lt;/p&gt;</comment>
                            <comment id="22207" author="garo" created="Mon, 10 Jan 2011 10:20:03 +0000"  >&lt;p&gt;Example of a finalize function which would fit the use case:&lt;/p&gt;


&lt;pre&gt;
f = function (key, value) {
    if (value.count &amp;gt; 100) {
        return value;
    } else {
        return null;
    }
}
&lt;/pre&gt;
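
&lt;p&gt;To make the proposal concrete, here is a small, self-contained sketch of how a null return from finalize could drop a key. The &lt;code&gt;runMapReduce&lt;/code&gt; driver is a hypothetical in-memory stand-in, not MongoDB&apos;s implementation, and the sample documents and the threshold of 2 are invented for illustration.&lt;/p&gt;

```javascript
// Hypothetical in-memory model of map -> reduce -> finalize.
// It only illustrates the proposed "null means discard" rule.
function runMapReduce(docs, map, reduce, finalize) {
  const groups = new Map();
  // Map phase: collect every emitted value under its key.
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const results = [];
  for (const [key, values] of groups) {
    // Reduce is only needed when a key has more than one value.
    const reduced = values.length > 1 ? reduce(key, values) : values[0];
    const finalized = finalize(key, reduced);
    // Proposed behavior: a null return from finalize drops the key.
    if (finalized !== null) results.push({ _id: key, value: finalized });
  }
  return results;
}

// Emit { count: 1 } per document, sum the counts, then filter in finalize.
const map = (doc, emit) => emit(doc.tag, { count: 1 });
const reduce = (key, values) => ({
  count: values.reduce((sum, v) => sum + v.count, 0),
});
const finalize = (key, value) => (value.count > 2 ? value : null);

const docs = ["a", "a", "a", "b"].map((tag) => ({ tag }));
const out = runMapReduce(docs, map, reduce, finalize);
console.log(JSON.stringify(out));
```

&lt;p&gt;With these four sample documents, only the key whose count is above the threshold remains in the output; the other key is dropped instead of being stored with a null value.&lt;/p&gt;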
</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25126"><![CDATA[Query Optimization]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10011"><![CDATA[Minor Change]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 17 Oct 2011 17:37:34 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        2 years, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>alexander.golin@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            2 years, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>antoine</customfieldvalue>
            <customfieldvalue>asya.kamsky@mongodb.com</customfieldvalue>
            <customfieldvalue>auto</customfieldvalue>
            <customfieldvalue>backlog-query-optimization</customfieldvalue>
            <customfieldvalue>esha.bhargava@mongodb.com</customfieldvalue>
            <customfieldvalue>jalava</customfieldvalue>
            <customfieldvalue>garo</customfieldvalue>
            <customfieldvalue>kostano@yahoo.com</customfieldvalue>
            <customfieldvalue>lalitagarw</customfieldvalue>
            <customfieldvalue>maziyar</customfieldvalue>
            <customfieldvalue>gargsatish0</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrp8sv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr2g07:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4931</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrqvif:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>