<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:31:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-13320] foreground index builds are much slower than background on large collections</title>
                <link>https://jira.mongodb.org/browse/SERVER-13320</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Ostensibly, a main reason for separate foreground/background builds is that foreground index builds are faster, at the cost of blocking the server [1]. However, we have observed, both in production and in synthetic benchmarks, that foreground index builds are often &lt;b&gt;much&lt;/b&gt; slower on large collections.&lt;/p&gt;

&lt;p&gt;In a synthetic benchmark, building a trivial index on a collection with about 6M records, each about 1k large, I saw the following numbers:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;[2014-03-21 00:05:10,042 22310|INFO] Done. items=6250000 fg=145.1 bg=94.7&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;[2014-03-21 00:09:01,545 22310|INFO] Done. items=6250000 fg=135.7 bg=95.4&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;[2014-03-21 00:12:43,822 22310|INFO] Done. items=6250000 fg=125.1 bg=96.9&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;[2014-03-21 00:16:25,450 22310|INFO] Done. items=6250000 fg=125.6 bg=95.8&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;[2014-03-21 00:20:00,567 22310|INFO] Done. items=6250000 fg=122.5 bg=92.3&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;The &quot;fg=&quot; number is seconds to build an index in the foreground, and &quot;bg=&quot; is for a background build. The requested indexes are identical, and are dropped each time.&lt;/p&gt;

&lt;p&gt;I don&apos;t have hard numbers right now, but experience in production suggests that the difference only gets worse as the collection gets even bigger.&lt;/p&gt;

&lt;p&gt;[1] &lt;a href=&quot;http://docs.mongodb.org/manual/tutorial/build-indexes-in-the-background/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://docs.mongodb.org/manual/tutorial/build-indexes-in-the-background/&lt;/a&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="124295">SERVER-13320</key>
            <summary>foreground index builds are much slower than background on large collections</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="ramon.fernandez@mongodb.com">Ramon Fernandez Marina</assignee>
                                    <reporter username="nelhage">Nelson Elhage</reporter>
                        <labels>
                    </labels>
                <created>Sun, 23 Mar 2014 17:42:52 +0000</created>
                <updated>Mon, 11 Jul 2016 17:18:29 +0000</updated>
                            <resolved>Wed, 3 Dec 2014 20:32:26 +0000</resolved>
                                    <version>2.4.4</version>
                    <version>2.6.0-rc1</version>
                                    <fixVersion>2.6.0</fixVersion>
                                    <component>Index Maintenance</component>
                                        <votes>0</votes>
                                    <watches>21</watches>
                                                                                                                <comments>
                            <comment id="777687" author="ramon.fernandez" created="Wed, 3 Dec 2014 19:41:41 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=nelhage&quot; class=&quot;user-hover&quot; rel=&quot;nelhage&quot;&gt;nelhage&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;our reading of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-13320?focusedCommentId=521392&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-521392&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;your comment from March 23&lt;/a&gt; was that you were still seeing background indexes being faster, and while we were not quite able to reproduce in a test setup we initiated a more formal testing process to get to the bottom of the issue &amp;#8211; because foreground indexes should ideally be faster than background indexes.&lt;/p&gt;

&lt;p&gt;That being said, we can indeed close this issue as &quot;Fixed&quot; (issue was present in 2.4 but is not in 2.6), and update the ticket if the results of the formal testing show any problems or discrepancies.&lt;/p&gt;

&lt;p&gt;Thanks for your patience with this matter, and thanks for using MongoDB.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="777667" author="nelhage" created="Wed, 3 Dec 2014 19:30:18 +0000"  >&lt;p&gt;So my latest understanding is that this issue was resolved in 2.6 by the rewrite of the external sort code, which I thought I expressed in my March 23 comment, but I may not have been sufficiently clear. It&apos;s still outstanding in 2.4, but probably not worth backporting an invasive fix for, so I suspect this should just be closed.&lt;/p&gt;</comment>
                            <comment id="777664" author="ramon.fernandez" created="Wed, 3 Dec 2014 19:28:15 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=nelhage&quot; class=&quot;user-hover&quot; rel=&quot;nelhage&quot;&gt;nelhage&lt;/a&gt;, just a quick note to let you know that this ticket is still under investigation, and so far we haven&apos;t been able to confirm the behavior you describe. We&apos;ll keep the ticket open until testing of this issue is complete.&lt;/p&gt;

&lt;p&gt;I also see that this ticket was opened against a 2.6.0-rc1 release candidate. Have you had a chance to test a full release and see whether the issue still persists? The latest stable release in the 2.6 series is 2.6.5, and 2.6.6 is scheduled for the coming weeks.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="712068" author="ramon.fernandez" created="Fri, 5 Sep 2014 02:12:37 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=nelhage&quot; class=&quot;user-hover&quot; rel=&quot;nelhage&quot;&gt;nelhage&lt;/a&gt;, our measurements so far do not provide any evidence that foreground index builds are slower than background index builds, but we&apos;re still investigating the issue in depth. We&apos;ll update this ticket when we have results on this investigation.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="523323" author="victor.hooi" created="Wed, 26 Mar 2014 00:24:40 +0000"  >&lt;p&gt;Hi Nelson,&lt;/p&gt;

&lt;p&gt;Just letting you know - we&apos;re getting some conflicting results, and are still looking into this.&lt;/p&gt;

&lt;p&gt;Also, I saw your last comment about 2.4.x vs. 2.6.x: we&apos;re also noticing that the effect is more pronounced on 2.4.x and much smaller on 2.6.x.&lt;/p&gt;

&lt;p&gt;I&apos;ll update this ticket when we have something conclusive.&lt;/p&gt;

&lt;p&gt;Cheers,&lt;br/&gt;
Victor&lt;/p&gt;</comment>
                            <comment id="521392" author="nelhage" created="Mon, 24 Mar 2014 02:00:07 +0000"  >&lt;p&gt;Ah-ha. I spent a while looking at the source between the last two comments, and now I understand what&apos;s going on.&lt;/p&gt;

&lt;p&gt;On 2.4, the external sort divides the data into a number of chunks, each of which fits in RAM, dumps each to a separate file, and does a na&#239;ve merge, which will be O(n&#178;) if the data to be sorted is much larger than RAM.&lt;/p&gt;

&lt;p&gt;On 2.6, the code has been updated to replace the na&#239;ve merge with a min-heap, which should have much better asymptotics. So there&apos;s still some question of why my tests show the 2.6 fg sort as slightly slower than bg, but at least the most egregious problem is resolved in 2.6. Feel free to close this as already resolved or wontfix if the small performance difference is considered uninteresting.&lt;/p&gt;</comment>
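<!--
A minimal sketch of the asymptotic difference described in the comment above,
assuming in-memory Python lists stand in for the on-disk sorted chunk files.
The function names naive_merge and heap_merge are hypothetical and are not
taken from the MongoDB source; this illustrates the two merge strategies, not
the server's actual implementation.

```python
import heapq


def naive_merge(runs):
    """Merge sorted runs by scanning every run head for each output item.

    Each output element costs O(k) comparisons for k runs, so the total is
    O(n * k); when k grows with n this approaches quadratic time.
    """
    positions = [0] * len(runs)
    merged = []
    while True:
        best = None  # index of the run whose head is currently smallest
        for i, run in enumerate(runs):
            if positions[i] == len(run):
                continue  # this run is exhausted
            if best is None or run[positions[i]] < runs[best][positions[best]]:
                best = i
        if best is None:
            return merged  # every run is exhausted
        merged.append(runs[best][positions[best]])
        positions[best] += 1


def heap_merge(runs):
    """Merge sorted runs with a min-heap of run heads: O(n log k) overall."""
    # Each heap entry is (value, run index, position within that run).
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, i, pos = heapq.heappop(heap)
        merged.append(value)
        if pos + 1 < len(runs[i]):
            heapq.heappush(heap, (runs[i][pos + 1], i, pos + 1))
    return merged
```

With n total items split into k chunks, the first approach does on the order
of n * k comparisons and the second on the order of n * log(k). Because k is
roughly the data size divided by available RAM, the gap widens as the
collection grows past memory.
-->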
                            <comment id="521391" author="nelhage" created="Mon, 24 Mar 2014 01:54:23 +0000"  >&lt;p&gt;Hm &amp;#8211; looking at my results again, I hadn&apos;t looked at the 2.6 numbers closely enough &amp;#8211; fg is still worse, but not nearly as dramatically. It&apos;s possible the really bad root issues are resolved there...&lt;/p&gt;</comment>
                            <comment id="521382" author="nelhage" created="Mon, 24 Mar 2014 01:21:18 +0000"  >&lt;p&gt;The test runs I pasted output from are on ec2 m1.large instances, which have 7.5G of RAM and 2x 2.27GHz Xeons. I ran the tests against the instance storage, which is essentially local disk. I saw similar results in tests on my laptop, which is a Thinkpad with an SSD, 8G of RAM, and a recent Intel Core processor (I can be more specific once I&apos;m back in the office).&lt;/p&gt;

&lt;p&gt;I don&apos;t have log files saved from test runs. I&apos;m happy to provide them &amp;#8211; which mongo version would be most useful for you?&lt;/p&gt;

&lt;p&gt;The test environments I&apos;m doing synthetic benchmarks on are not monitored by mms. This investigation was triggered on our end by several index builds that took &lt;b&gt;days&lt;/b&gt; on a secondary, and some of those nodes are monitored by mms. For instance, &lt;a href=&quot;https://mms.mongodb.com/host/detail/4f14ac99dc514c3e10871e49/e9ddc87371c836ac7f78cf923344604f&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://mms.mongodb.com/host/detail/4f14ac99dc514c3e10871e49/e9ddc87371c836ac7f78cf923344604f&lt;/a&gt; spent most of the time from March 12 to March 18 doing index builds.&lt;/p&gt;</comment>
                            <comment id="521376" author="victor.hooi" created="Mon, 24 Mar 2014 00:49:36 +0000"  >&lt;p&gt;Hi Nelson,&lt;/p&gt;

&lt;p&gt;Thank you for the detailed report, and the Github test script.&lt;/p&gt;

&lt;p&gt;I&apos;m currently attempting to reproduce this on my system, and would appreciate it if you could provide some additional information.&lt;/p&gt;

&lt;p&gt;Firstly - are you able to provide any more details about your environment? (e.g. CPU, RAM, etc.) And if these machines are monitored on our cloud MMS, are you able to provide a link to them please?&lt;/p&gt;

&lt;p&gt;Secondly, are you able to provide the logfiles during one of the test runs please? We&apos;d like to check for any MongoDB anomalies during the index builds.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Victor&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Sun, 23 Mar 2014 21:21:44 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        9 years, 11 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            9 years, 11 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>nelhage</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>victor.hooi</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlya7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrwxg7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>106165</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;p&gt;Clone &lt;a href=&quot;https://github.com/nelhage/mongod-tests&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/nelhage/mongod-tests&lt;/a&gt;, and run &quot;index.py&quot;. This will start a local mongod and begin inserting records into a test collection, stopping periodically to do both fg and bg index builds and report the timing. (Note that this will use unbounded amounts of storage in /tmp &amp;#8211; you can redirect it with TMPDIR=... in the environment).&lt;/p&gt;

&lt;p&gt;Sample output on 2.4.4: &lt;a href=&quot;https://nelhage.com/paste/2014-03-20EdWTQGGH&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://nelhage.com/paste/2014-03-20EdWTQGGH&lt;/a&gt;&lt;br/&gt;
and 2.6.0rc1: &lt;a href=&quot;https://nelhage.com/paste/2014-03-20JXX2L3HT&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://nelhage.com/paste/2014-03-20JXX2L3HT&lt;/a&gt;&lt;/p&gt;</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsh1kf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>