<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:22:03 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-10024] cluster can end up with large chunks that did not get split and will time out on migration</title>
                <link>https://jira.mongodb.org/browse/SERVER-10024</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Consider the case where:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;large volume of insertion&lt;/li&gt;
	&lt;li&gt;migration is slow due to slow hardware and many indices (e.g. 20)&lt;/li&gt;
	&lt;li&gt;consequently each moveChunk operation takes a long time (e.g. 1 min)&lt;/li&gt;
	&lt;li&gt;consequently any split fails during that time because the namespace is locked, and chunks grow larger&lt;/li&gt;
	&lt;li&gt;consequently chunks take even longer to move. This downward spiral makes things worse and worse&lt;/li&gt;
	&lt;li&gt;eventually chunks cannot be moved at all. Each migration is aborted after some minutes and makes no progress, yet the system stays constantly busy trying to migrate those documents.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I think we need several server improvements:&lt;/p&gt;

&lt;p&gt;A. Any chunk migration aborted due to a timeout should result in a split. If anything, the split won&apos;t hurt. Right now the split seems to happen only in a specific case.&lt;/p&gt;

&lt;p&gt;B. Ideally, the migration process would avoid retrying the same chunk over and over. This may need some randomization in the choice of candidate chunks.&lt;/p&gt;

&lt;p&gt;C. When mongos fails to split because the namespace is locked, it should mark the metadata as &quot;needs split&quot; for later. Ideally, all &quot;needs split&quot; marks should be cleared before the next migration is attempted.&lt;/p&gt;

&lt;p&gt;All of this is to avoid the catch-22 where large chunks end up clogging the whole system.&lt;/p&gt;
</description>
                <environment></environment>
        <key id="80274">SERVER-10024</key>
            <summary>cluster can end up with large chunks that did not get split and will time out on migration</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="pierlauro.sciarelli@mongodb.com">Pierlauro Sciarelli</assignee>
                                    <reporter username="antoine">Antoine Girbal</reporter>
                        <labels>
                            <label>sharding-wfbf-day</label>
                    </labels>
                <created>Tue, 25 Jun 2013 18:14:59 +0000</created>
                <updated>Fri, 14 Apr 2023 10:21:05 +0000</updated>
                            <resolved>Fri, 14 Apr 2023 10:21:05 +0000</resolved>
                                                                    <component>Sharding</component>
                                        <votes>1</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="5347032" author="pierlauro.sciarelli" created="Fri, 14 Apr 2023 10:19:42 +0000"  >&lt;p&gt;Closing this ticket as gone away because:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;The description refers to a deprecated auto-splitter behavior present only in versions that reached EOL long ago, when the component ran on routers.&lt;/li&gt;
	&lt;li&gt;The described problem is surely not present in currently supported versions because the chunk splitter was moved to the shards in v4.2, which is currently the lowest supported version (not for long, as it goes EOL this month).&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;As a side note, the auto-splitter has gone away starting from v6.0 so the pre-splitting solution proposed by Nic is not viable anymore. Quoting &lt;a href=&quot;https://www.mongodb.com/docs/manual/release-notes/6.0/#balancing-policy-changes&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;6.0 release notes&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Starting in MongoDB 6.0.3, data in sharded clusters is distributed based on data size rather than number of chunks. As a result, you should be aware of the following significant changes in sharded cluster data distribution behavior:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;The balancer distributes ranges of data rather than chunks. The balancing policy looks for evenness of data distribution rather than chunk distribution.&lt;/li&gt;
	&lt;li&gt;Chunks are not subject to auto-splitting. Instead, chunks are split only when moved across shards.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;</comment>
                            <comment id="2765589" author="nicholas.cottrell" created="Mon, 27 Jan 2020 15:00:45 +0000"  >&lt;p&gt;With regard to problems with slow migration during special insert workloads, one good solution is to &lt;a href=&quot;https://docs.mongodb.com/manual/tutorial/create-chunks-in-sharded-cluster/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;manually split chunks&lt;/a&gt; prior to starting the workload and allow the balancer to re-balance the new empty chunks. When you start the import workload, the inserts should then be distributed across all available shards rather than all being sent to a single &quot;hot shard&quot; and then migrated in a subsequent step.&lt;/p&gt;</comment>
                            <comment id="397121" author="justanyone" created="Tue, 6 Aug 2013 17:41:50 +0000"  >&lt;p&gt;I&apos;m seeing this behaviour when doing mongorestore of a large database.  I end up with a bunch of unbalanced shards (we have 48 shards) that are not splitting because mongorestore is eating all the IO.  So, balancer isn&apos;t going, splitter sometimes fails, etc.  I end up having to stop the mongorestore, restart daemons/mongos processes, stopBalancer()/startBalancer()/setBalancerState(false)-wait-2-minutes-setBalancerState(true), etc., etc., until the balancer decides to start working, wait for it to balance, then start the mongorestore again with a properly split and balanced set of data.&lt;/p&gt;
</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="972396">SERVER-44088</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25133"><![CDATA[Sharding EMEA]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 6 Aug 2013 17:41:50 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        42 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>pierlauro.sciarelli@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            42 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>antoine</customfieldvalue>
            <customfieldvalue>justanyone</customfieldvalue>
            <customfieldvalue>nicholas.cottrell@mongodb.com</customfieldvalue>
            <customfieldvalue>pierlauro.sciarelli@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrmp67:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrf87z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7181</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="6818">Sharding EMEA 2023-04-17</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsvnvb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>