<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 02:59:51 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-2398] for inline mapreduce, all emitted objects are kept in RAM before the 1st reduce, potential high memory usage</title>
                <link>https://jira.mongodb.org/browse/SERVER-2398</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;During the map phase, checkSize() is called to run reduceInMemory and, if needed, dumpToInc.&lt;br/&gt;
But in inline mode, checkSize does not do anything.&lt;br/&gt;
As a result, all objects are emitted before the first attempt to reduce.&lt;br/&gt;
Instead, reduceInMemory should be called if the map is over a certain size, or if there is potential for reduce.&lt;/p&gt;</description>
                <environment></environment>
        <key id="14289">SERVER-2398</key>
            <summary>for inline mapreduce, all emitted objects are kept in RAM before the 1st reduce, potential high memory usage</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="antoine">Antoine Girbal</assignee>
                                    <reporter username="antoine">Antoine Girbal</reporter>
                        <labels>
                    </labels>
                <created>Mon, 24 Jan 2011 06:58:25 +0000</created>
                <updated>Tue, 12 Jul 2016 00:17:41 +0000</updated>
                            <resolved>Wed, 26 Jan 2011 18:35:48 +0000</resolved>
                                                    <fixVersion>1.7.5</fixVersion>
                                                        <votes>1</votes>
                                    <watches>0</watches>
                                                                                                                <comments>
                            <comment id="22679" author="antoine" created="Tue, 25 Jan 2011 21:48:35 +0000"  >&lt;p&gt;Here is a test that shows the problem.&lt;br/&gt;
Add 1000000 docs to the collection:&lt;br/&gt;
foo:PRIMARY&amp;gt; for (var i = 0; i &amp;lt; 1000000; ++i) { db.large.save({a: Math.random(10000), str: &quot;aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&quot;}) }&lt;/p&gt;

&lt;p&gt;Use a map function that always emits the same key:&lt;br/&gt;
foo:PRIMARY&amp;gt; map = function() { emit(1, 1); }&lt;br/&gt;
function () {&lt;br/&gt;
    emit(1, 1);&lt;br/&gt;
}&lt;br/&gt;
foo:PRIMARY&amp;gt; reduce = function(key, vals) { var sum = 0; for (var i = 0; i &amp;lt; vals.length; ++i) { sum += vals[i]; } return sum; }&lt;br/&gt;
function (key, vals) {&lt;br/&gt;
    var sum = 0;&lt;br/&gt;
    for (var i = 0; i &amp;lt; vals.length; ++i) {&lt;br/&gt;
        sum += vals[i];&lt;br/&gt;
    }&lt;br/&gt;
    return sum;&lt;br/&gt;
}&lt;/p&gt;

&lt;p&gt;Then apply MR:&lt;br/&gt;
foo:PRIMARY&amp;gt; a = db.large.mapReduce(map, reduce, {out: { inline : 1}});&lt;br/&gt;
The operation takes very long because the internal map grows large.&lt;br/&gt;
I let it run for 1000s and eventually just killed it.&lt;br/&gt;
The resident memory usage also increases to 1GB and beyond.&lt;/p&gt;
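
&lt;p&gt;To illustrate the idea of reducing early rather than buffering every emitted value, here is a minimal standalone sketch in plain JavaScript. It is not the mongod implementation: inlineMapReduce, maxBufferedValues and hasDuplicates are hypothetical names, the threshold counts buffered values rather than bytes, and emit is passed to the map function as an argument instead of being a global as in the shell.&lt;/p&gt;

```javascript
// Hypothetical sketch only: names and the threshold are illustrative,
// and this is not how mongod is actually implemented.
function inlineMapReduce(docs, map, reduce, maxBufferedValues) {
  const buffer = new Map();   // key -> array of emitted or partially reduced values
  let buffered = 0;           // number of values currently held in memory
  let hasDuplicates = false;  // true once some key holds more than one value

  // Collapse each key's values into a single partially reduced value.
  function reduceInMemory() {
    for (const [key, vals] of buffer) {
      if (vals.length > 1) {
        buffer.set(key, [reduce(key, vals)]);
      }
    }
    buffered = buffer.size;
    hasDuplicates = false;
  }

  function emit(key, value) {
    const vals = buffer.get(key);
    if (vals === undefined) {
      buffer.set(key, [value]);
    } else {
      vals.push(value);
      hasDuplicates = true;
    }
    buffered += 1;
    // Reduce early only when the buffer is large and reducing can shrink it.
    if (buffered >= maxBufferedValues) {
      if (hasDuplicates) {
        reduceInMemory();
      }
    }
  }

  for (const doc of docs) {
    map.call(doc, emit); // assumption: emit is passed in instead of being global
  }
  reduceInMemory(); // final pass over whatever is still buffered
  return Array.from(buffer, ([key, vals]) => ({ _id: key, value: vals[0] }));
}

// Usage: every document emits the same key, as in the test above.
const docs = Array.from({ length: 100000 }, () => ({ a: Math.random() }));
const results = inlineMapReduce(
  docs,
  function (emit) { emit(1, 1); },
  (key, vals) => vals.reduce((sum, v) => sum + v, 0),
  1000
);
console.log(results); // [ { _id: 1, value: 100000 } ]
```

&lt;p&gt;With a single repeated key, the buffer in this sketch never holds more than maxBufferedValues values, whereas without the early reduce it would hold one value per emit.&lt;/p&gt;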

&lt;p&gt;Added a fix where data gets reduced every 50KB if there are potential duplicates.&lt;br/&gt;
Now the operation completes in roughly 20s.&lt;br/&gt;
The memory usage of mongod also does not increase at all (356MB).&lt;br/&gt;
foo:PRIMARY&amp;gt; a = db.large.mapReduce(map, reduce, {out: { inline : 1}});&lt;br/&gt;
{&lt;br/&gt;
	&quot;results&quot; : [&lt;br/&gt;
		{&lt;br/&gt;
			&quot;_id&quot; : 1,&lt;br/&gt;
			&quot;value&quot; : 1020000&lt;br/&gt;
		}&lt;br/&gt;
	],&lt;br/&gt;
	&quot;timeMillis&quot; : 21974,&lt;br/&gt;
	&quot;counts&quot; : {&lt;br/&gt;
		&quot;input&quot; : 1020000,&lt;br/&gt;
		&quot;emit&quot; : 1020000,&lt;br/&gt;
		&quot;output&quot; : 1&lt;br/&gt;
	},&lt;br/&gt;
	&quot;ok&quot; : 1,&lt;br/&gt;
}&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        13 years, 4 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            13 years, 4 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>antoine</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrp82f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hricq7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>20713</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht0gjb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>