<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:31:43 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-13432] Consider changing index build bucket fill %</title>
                <link>https://jira.mongodb.org/browse/SERVER-13432</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Take an index with keys of near perfect uniform distribution. In the repro, I use an MD5 hash. Perform a reindex on the collection.&lt;br/&gt;
The next many inserts to this collection will cause all the buckets to split, which leads to the index very quickly double in size. This size is often much larger than the index was before reindexing.&lt;/p&gt;

&lt;p&gt;The reindex command should leave the index in a state where it does not almost instantly double in size.&lt;/p&gt;</description>
                <environment></environment>
        <key id="128672">SERVER-13432</key>
            <summary>Consider changing index build bucket fill %</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="backlog-server-execution">Backlog - Storage Execution Team</assignee>
                                    <reporter username="rod.adams@mongodb.com">Rod Adams</reporter>
                        <labels>
                    </labels>
                <created>Tue, 1 Apr 2014 00:21:05 +0000</created>
                <updated>Tue, 6 Dec 2022 05:08:22 +0000</updated>
                            <resolved>Fri, 14 Sep 2018 19:59:51 +0000</resolved>
                                                                    <component>Index Maintenance</component>
                    <component>MMAPv1</component>
                                        <votes>0</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="1485260" author="milkie" created="Wed, 25 Jan 2017 19:31:58 +0000"  >&lt;p&gt;I believe this request boils down to a change to the foreground index build&apos;s bucket fill percentage (by lowering it, or by adding jitter).  Is that correct?&lt;br/&gt;
Background index builds are equivalent to inserting data records one by one into an empty index, so I believe that if you ran the reindex command for an index that was originally built in the background, you wouldn&apos;t see the same growth behavior.&lt;br/&gt;
I believe that in earlier versions of MongoDB, the bucket file percentage for foreground index builds was set the same as the bucket split percentage, but that ended up wasting a lot of space in indexes that never added new keys within the current keyspace.  The current setting does not waste as much space for such workloads, at the expense of the growth behavior for evenly distributed keys as you have noted in the Description.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="40688" name="repro.py" size="1278" author="rod.adams@mongodb.com" created="Tue, 1 Apr 2014 00:21:05 +0000"/>
                            <attachment id="40689" name="v2.4.9.log" size="3151" author="rod.adams@mongodb.com" created="Tue, 1 Apr 2014 00:21:05 +0000"/>
                            <attachment id="40690" name="v2.6.0-rc2.log" size="3155" author="rod.adams@mongodb.com" created="Tue, 1 Apr 2014 00:21:05 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25136"><![CDATA[Storage Execution]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 27 May 2014 04:13:54 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        7 years, 3 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>false</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>alexander.golin@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            7 years, 3 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-execution</customfieldvalue>
            <customfieldvalue>milkie@mongodb.com</customfieldvalue>
            <customfieldvalue>rod.adams@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlxo7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr8h3z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>110178</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;p&gt;repro.py is a short python client to demonstrate how the index size per record count behaves under a load of storing random md5 hashes with the following usage pattern:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;insert 200k documents (stabilizes around 75B/doc)&lt;/li&gt;
	&lt;li&gt;resize (drops to 50B/doc)&lt;/li&gt;
	&lt;li&gt;Add 10k documents, in 1k increments (Goes up to 96B/doc)&lt;/li&gt;
	&lt;li&gt;repeats the resize/add 10k a few times, to verify it always happens.&lt;/li&gt;
	&lt;li&gt;Add 200k more documents (Stabilizes at a medium value again)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I included the output of running the repro for 2.6.0-rc2 and 2.4.9.&lt;/p&gt;</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsh0qn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>