<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:21:47 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-9907] Allow to skip initial count() in mapreduce</title>
                <link>https://jira.mongodb.org/browse/SERVER-9907</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;div class=&quot;panel&quot; style=&quot;background-color: #EEEEEE;border-color: #ccc;border-width: 1px;&quot;&gt;&lt;div class=&quot;panelHeader&quot; style=&quot;border-bottom-width: 1px;border-bottom-color: #ccc;background-color: #6CB33F;&quot;&gt;&lt;b&gt;MongoDB Status as of October 9th, 2013&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;panelContent&quot; style=&quot;background-color: #EEEEEE;&quot;&gt;
&lt;p&gt;&lt;b&gt;ISSUE SUMMARY&lt;/b&gt;&lt;br/&gt;
In order to report progress of ongoing mapReduce jobs, the filter query used to select the input documents is run up front to get the total count of matching documents.  For long-running queries, this extra count, needed only for progress logging, adds significantly to the overall mapReduce run time.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;USER IMPACT&lt;/b&gt;&lt;br/&gt;
This fix is a performance improvement only.  The log messages emitted during a mapReduce change when a filter is used: instead of outputting &quot;percentage complete,&quot; a running count of documents processed is reported.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;SOLUTION&lt;/b&gt;&lt;br/&gt;
The issue has been resolved by using the total count of documents in the ProgressMeter only when no query filter is used.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;WORKAROUNDS&lt;/b&gt;&lt;br/&gt;
There is no workaround.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;PATCHES&lt;/b&gt;&lt;br/&gt;
Production release v2.4.7 contains the fix for this issue, and production release v2.6.0 will contain the fix as well.&lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;h4&gt;&lt;a name=&quot;OriginalDescription&quot;&gt;&lt;/a&gt;Original Description&lt;/h4&gt;

&lt;p&gt;A significant portion of the map reduce job may be spent actually matching the input documents.&lt;br/&gt;
Right now we do an initial count() (line 594 mr.cpp) in order to display the progress meter.&lt;/p&gt;

&lt;p&gt;In my production example, about 90% of the time is spent matching the input documents (no ideal way to index further) and consequently the wasted initial count() takes half of the entire job time.&lt;/p&gt;

&lt;p&gt;Either:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;remove the initial count() and have the progress meter just display how many documents have been processed instead of % of completion&lt;/li&gt;
	&lt;li&gt;add an option like &quot;in.showProgress: false&quot; to disable the count().&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;This map reduce application will have to ingest a large volume of data, and the matching rules are pretty complex, so having that option may save up to 50% of MR execution time.&lt;/p&gt;</description>
                <environment></environment>
        <key id="78683">SERVER-9907</key>
            <summary>Allow to skip initial count() in mapreduce</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="randolph@mongodb.com">Randolph Tan</assignee>
                                    <reporter username="antoine">Antoine Girbal</reporter>
                        <labels>
                    </labels>
                <created>Tue, 11 Jun 2013 22:09:18 +0000</created>
                <updated>Mon, 11 Jul 2016 17:39:16 +0000</updated>
                            <resolved>Tue, 27 Aug 2013 14:07:29 +0000</resolved>
                                                    <fixVersion>2.4.7</fixVersion>
                    <fixVersion>2.5.2</fixVersion>
                                    <component>MapReduce</component>
                                        <votes>0</votes>
                                    <watches>7</watches>
                                                                                                                <comments>
                            <comment id="434963" author="auto" created="Wed, 2 Oct 2013 23:48:19 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;username&apos;: u&apos;renctan&apos;, u&apos;name&apos;: u&apos;Randolph Tan&apos;, u&apos;email&apos;: u&apos;randolph@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9907&quot; title=&quot;Allow to skip initial count() in mapreduce&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9907&quot;&gt;&lt;del&gt;SERVER-9907&lt;/del&gt;&lt;/a&gt; Allow to skip initial count() in mapreduce&lt;/p&gt;

&lt;p&gt;Do not count total documents to process if filter is given.&lt;br/&gt;
Branch: v2.4&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/2711fa56d83006ddf91978f8e749074769cc121a&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/2711fa56d83006ddf91978f8e749074769cc121a&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="434962" author="auto" created="Wed, 2 Oct 2013 23:48:17 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;username&apos;: u&apos;renctan&apos;, u&apos;name&apos;: u&apos;Randolph Tan&apos;, u&apos;email&apos;: u&apos;randolph@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9907&quot; title=&quot;Allow to skip initial count() in mapreduce&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9907&quot;&gt;&lt;del&gt;SERVER-9907&lt;/del&gt;&lt;/a&gt; Allow to skip initial count() in mapreduce&lt;/p&gt;

&lt;p&gt;Added option to hide the total in the progress meter.&lt;/p&gt;

&lt;p&gt;Conflicts:&lt;br/&gt;
	src/mongo/util/progress_meter.cpp&lt;br/&gt;
Branch: v2.4&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/eb13f30cec8e9fcecddc0e91e7db85ba033e1415&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/eb13f30cec8e9fcecddc0e91e7db85ba033e1415&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="391901" author="auto" created="Tue, 30 Jul 2013 15:50:46 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;username&apos;: u&apos;renctan&apos;, u&apos;name&apos;: u&apos;Randolph Tan&apos;, u&apos;email&apos;: u&apos;randolph@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9907&quot; title=&quot;Allow to skip initial count() in mapreduce&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9907&quot;&gt;&lt;del&gt;SERVER-9907&lt;/del&gt;&lt;/a&gt; Allow to skip initial count() in mapreduce&lt;/p&gt;

&lt;p&gt;Do not count total documents to process if filter is given.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/9f6cf548d5fe19daf4478e5ffd4072a3993302e0&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/9f6cf548d5fe19daf4478e5ffd4072a3993302e0&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="391900" author="auto" created="Tue, 30 Jul 2013 15:50:44 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;username&apos;: u&apos;renctan&apos;, u&apos;name&apos;: u&apos;Randolph Tan&apos;, u&apos;email&apos;: u&apos;randolph@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9907&quot; title=&quot;Allow to skip initial count() in mapreduce&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9907&quot;&gt;&lt;del&gt;SERVER-9907&lt;/del&gt;&lt;/a&gt; Allow to skip initial count() in mapreduce&lt;/p&gt;

&lt;p&gt;Added option to hide the total in the progress meter.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/0bc4c30550668d547889b80af209f7623e72a1a9&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/0bc4c30550668d547889b80af209f7623e72a1a9&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="358744" author="dan@10gen.com" created="Wed, 12 Jun 2013 16:25:40 +0000"  >&lt;p&gt;Should remove the count in the case that there is a filter.  Need to figure out how this will work with the progress meter/logging.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="110575">SERVER-12710</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 12 Jun 2013 16:25:40 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        10 years, 20 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            10 years, 20 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>antoine</customfieldvalue>
            <customfieldvalue>auto</customfieldvalue>
            <customfieldvalue>dan@mongodb.com</customfieldvalue>
            <customfieldvalue>randolph@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrmqgf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrr3fb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>71912</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsg047:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>