<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 07:53:40 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
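
A minimal sketch of building such a request URL, assuming Python's standard urllib.parse module. The helper name with_fields and the example host are hypothetical and are not part of Jira; the only detail taken from the note above is the repeated 'field' parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fields(url, fields):
    """Return url with one 'field' query parameter appended per requested field."""
    parts = urlsplit(url)
    # Keep any query parameters the URL already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Repeat the 'field' parameter once per requested field name.
    query.extend(("field", name) for name in fields)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical XML view URL, used only for illustration.
print(with_fields("https://jira.example.invalid/issue.xml", ["key", "summary"]))
```

Passing a list of pairs to urlencode keeps the repeated parameter in order, which a plain dict could not represent.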
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[DOCS-7145] Clarify the manual on sharding existing collection size limit. </title>
                <link>https://jira.mongodb.org/browse/DOCS-7145</link>
                <project id="10380" key="DOCS">Documentation</project>
                    <description>&lt;p&gt;*&lt;b&gt;edit&lt;/b&gt;* added note: the 8,192 split point limit does not apply to the initial sharding of a collection (note added here so that the ticket description does not create the mistaken impression that it does).&lt;/p&gt;

&lt;p&gt;It would be good to revisit and clarify the manual on:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.mongodb.org/manual/reference/limits/#Sharding-Existing-Collection-Data-Size&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.org/manual/reference/limits/#Sharding-Existing-Collection-Data-Size&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We should better explain how the two sizes (256GB and 400GB) were calculated or estimated.&lt;/p&gt;

&lt;p&gt;Also, it would be great to revisit the table:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Clarifying how the number of splits is calculated.&lt;/li&gt;
	&lt;li&gt;Removing the max collection size for a 1MB chunk size. When sharding an existing collection, it is generally beneficial to increase the chunk size rather than reduce it.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;NOTE: For 3.4, need to revisit this as there are limits for &lt;b&gt;empty collections with hashed shard keys&lt;/b&gt;. &lt;/p&gt;
</description>
                <environment></environment>
        <key id="265049">DOCS-7145</key>
            <summary>Clarify the manual on sharding existing collection size limit. </summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="ravind.kumar">Ravind Kumar</assignee>
                                    <reporter username="wan.bachtiar@mongodb.com">Wan Bachtiar</reporter>
                        <labels>
                    </labels>
                <created>Fri, 12 Feb 2016 12:25:53 +0000</created>
                <updated>Mon, 30 Oct 2023 21:36:32 +0000</updated>
                            <resolved>Thu, 9 Jun 2016 15:39:44 +0000</resolved>
                                                    <fixVersion>Server_Docs_20231030</fixVersion>
                                    <component>manual</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="1294343" author="renctan" created="Tue, 14 Jun 2016 21:03:28 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ravind.kumar&quot; class=&quot;user-hover&quot; rel=&quot;ravind.kumar&quot;&gt;ravind.kumar&lt;/a&gt; The numInitialChunks hard limit applies only to v3.4. Everything else should be the same for older versions of mongo.&lt;/p&gt;</comment>
                            <comment id="1292582" author="ravind.kumar" created="Mon, 13 Jun 2016 19:04:58 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=renctan&quot; class=&quot;user-hover&quot; rel=&quot;renctan&quot;&gt;renctan&lt;/a&gt;, Is there anything here that would not apply to 3.0, or possibly 2.6? This might be worth backporting.&lt;/p&gt;</comment>
                            <comment id="1292572" author="xgen-internal-githook" created="Mon, 13 Jun 2016 18:58:37 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;username&apos;: u&apos;rkumar-mongo&apos;, u&apos;name&apos;: u&apos;ravind&apos;, u&apos;email&apos;: u&apos;ravind.kumar@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-7145&quot; title=&quot;Clarify the manual on sharding existing collection size limit. &quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-7145&quot;&gt;&lt;del&gt;DOCS-7145&lt;/del&gt;&lt;/a&gt;: limits for sharding existing data&lt;/p&gt;

&lt;p&gt;Signed-off-by: kay &amp;lt;kay.kim@10gen.com&amp;gt;&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/docs/commit/7b6fda5517da0a730c5b5e917038e2038f05d109&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/docs/commit/7b6fda5517da0a730c5b5e917038e2038f05d109&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="1289358" author="ravind.kumar" created="Thu, 9 Jun 2016 15:39:44 +0000"  >&lt;p&gt;&lt;a href=&quot;https://github.com/mongodb/docs/pull/2643&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/docs/pull/2643&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="1285131" author="renctan" created="Mon, 6 Jun 2016 15:31:17 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Randolph Tan, apologies for leaving out the context. I was referring to comments on &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-7145&quot; title=&quot;Clarify the manual on sharding existing collection size limit. &quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-7145&quot;&gt;&lt;del&gt;DOCS-7145&lt;/del&gt;&lt;/a&gt;:comments and HELP-1859:comment . &lt;br/&gt;
Thanks for clearing that up, I appreciate it.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;In the first comment, I believe Asya was referring to sharding a collection with existing data. In this scenario, mongos will create new chunks for the collection (as if splitting min-&amp;gt;max into several chunks). The second one refers to the user calling the split command explicitly. Note that there is a special case where the 8192 limit applies, as demonstrated in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-22430&quot; title=&quot;Validate the numInitialChunks parameter for &amp;#39;shardcollection&amp;#39; earlier&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-22430&quot;&gt;&lt;del&gt;SERVER-22430&lt;/del&gt;&lt;/a&gt;: sharding an empty collection with a hashed key and specifying numInitialChunks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Also in the second calculation, this line: splitPointsRequired = &amp;lt;average document size&amp;gt; / &amp;lt;shard key size&amp;gt; doesn&apos;t quite make sense to me. Generally speaking, a document would contain one shard key.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;If my understanding of the formula is correct, both &quot;size&quot; terms refer to the BSON object size. If that&apos;s the case, I also don&apos;t follow how this formula came about.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Which means for v3.2 docs, it&apos;s only a minimum value between 1,000,000 and max BSON size/shard key size.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Note: The 1000000 limit was added together with the 8192*nShards limit. In other words, this check did not exist in v3.2.&lt;/p&gt;
</comment>
                            <comment id="1282704" author="ravind.kumar" created="Thu, 2 Jun 2016 18:05:12 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=wan.bachtiar&quot; class=&quot;user-hover&quot; rel=&quot;wan.bachtiar&quot;&gt;wan.bachtiar&lt;/a&gt;, my understanding is that the first calculation is using the maximum BSON document size of 16MB. While this is good for estimating maximum collection size based on shard key size and chunk size, I imagine most customers do not approach that limit very often. So the second formula would just be the average document size of the target collection instead of the maximum BSON document size. &lt;/p&gt;

&lt;p&gt;For example, a 64 bit shard key with a 64 MB chunk size would allow for up to 8TB of size, requiring 16 shards to support every split point (3.4+). But if the customer has an average document size of only 4MB, the number of split points would be 4x lower, as would be the number of shards. I wouldn&apos;t want a customer to view the formula / table and end up with a much larger number of shards than they need for their collection. Maybe I&apos;m over-estimating the issue here. &lt;/p&gt;</comment>
                            <comment id="1275369" author="ravind.kumar" created="Wed, 25 May 2016 19:30:54 +0000"  >&lt;p&gt;I&apos;ve updated the code review based on some of the discussions here. Please review when you get a chance. &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=wan.bachtiar&quot; class=&quot;user-hover&quot; rel=&quot;wan.bachtiar&quot;&gt;wan.bachtiar&lt;/a&gt;, I folded in the number of shards as a measure of the minimum number needed to support a given number of split points.&lt;/p&gt;</comment>
                            <comment id="1221706" author="asya" created="Thu, 31 Mar 2016 20:32:32 +0000"  >&lt;blockquote&gt;&lt;p&gt;Does the maximum split limit also affect existing sharded clusters that are growing significantly, or just existing collections that need to be sharded?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;All of this discussion is &lt;b&gt;ONLY&lt;/b&gt; applicable to enabling sharding on an existing non-sharded collection.   None of the discussion applies to already sharded collections.&lt;/p&gt;</comment>
                            <comment id="1211402" author="asya" created="Tue, 22 Mar 2016 16:25:17 +0000"  >&lt;p&gt;please note that this ticket and &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-7254&quot; title=&quot;Comment on: &amp;quot;manual/reference/limits.txt&amp;quot;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-7254&quot;&gt;&lt;del&gt;DOCS-7254&lt;/del&gt;&lt;/a&gt; are marked as duplicates but both are still open.&lt;/p&gt;

&lt;p&gt;Anyway, 8192 is a non-issue for the initial sharding of a collection.  It&apos;s only a limit when running the split command manually.&lt;/p&gt;
</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="267920">DOCS-7254</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 22 Mar 2016 16:16:49 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        7 years, 35 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>emet.ozar@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            7 years, 35 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>asya.kamsky@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>randolph@mongodb.com</customfieldvalue>
            <customfieldvalue>ravind.kumar</customfieldvalue>
            <customfieldvalue>wan.bachtiar@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrn0cf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrcwcf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="1162">Docs Sprint 2016(0711-0729)</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrz75b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                </customfields>
    </item>
</channel>
</rss>