<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:35:42 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-14704] Chunk migration become (linearly?) slower as collection grows</title>
                <link>https://jira.mongodb.org/browse/SERVER-14704</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;In my use case I have multiple databases with the same &quot;schemas&quot; and types of data. I&apos;ve noticed that chunk migration becomes slower and slower, in correlation with the collection size.&lt;/p&gt;

&lt;p&gt;For small databases/collections, migrating a chunk generally takes less than 20 seconds, while for my bigger collections it takes 1800 seconds on average (sometimes more than 1 hour), with every nuance in between (I have about 35 identical databases, of all sizes). Chunks have roughly the same size and number of documents in all cases, with exactly the same indexes.&lt;/p&gt;

&lt;p&gt;Updates/inserts are happening, but at a slow pace (I&apos;d say less than 10 updates/inserts per hour on the chunk being migrated).&lt;br/&gt;
My chunks are 256MB and each document has an average size of 2810 bytes (about 50,000 documents per chunk / 140MB, as it seems chunks aren&apos;t &quot;full&quot;). The cluster doesn&apos;t receive a lot of writes (globally about 30 updates and 5 inserts per second) and I moved as many reads as possible to secondaries. Almost no deletes are happening, cluster-wide.&lt;/p&gt;

&lt;p&gt;All disks are regular SATA (because of dataset size).&lt;/p&gt;

&lt;p&gt;Example of a slow migration:&lt;br/&gt;
&quot;step 1 of 6&quot; : 119,&lt;br/&gt;
&quot;step 2 of 6&quot; : 3266,&lt;br/&gt;
&quot;step 3 of 6&quot; : 1618,&lt;br/&gt;
&quot;step 4 of 6&quot; : 2597284,&lt;br/&gt;
&quot;step 5 of 6&quot; : 2733,&lt;br/&gt;
&quot;step 6 of 6&quot; : 0&lt;/p&gt;

&lt;p&gt;The data does not fit in RAM (but the indexes do).&lt;br/&gt;
When I look at the logs of the &quot;sender&quot;, I can see that &quot;cloned&quot;/&quot;clonedBytes&quot; increase very slowly, pausing every 16MB or so for a few seconds.&lt;/p&gt;

&lt;p&gt;iotop tells me that both the sender and the recipient are performing a lot of &lt;b&gt;writes&lt;/b&gt; (both stuck at 100%), orders of magnitude more than what is being transmitted.&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;The sender&lt;br/&gt;
It&apos;s a basic server with 16GB of RAM and software RAID 1 SATA disks.&lt;br/&gt;
On the sender I&apos;d expect high reads/low writes (as the range deleter removes the previously transmitted chunks). Due to data locality I&apos;d probably expect reads to be slower on big collections, but I definitely wouldn&apos;t expect that amount of writes.&lt;br/&gt;
Typical &quot;atop&quot; output:&lt;br/&gt;
DSK |          sda  | busy    100% | read     130  | write   2635 |  MBr/s   0.13 | MBw/s   2.07 |  avio 3.62 ms |&lt;br/&gt;
DSK |          sdb  | busy     81% | read      83  | write   2613 |  MBr/s   0.09 | MBw/s   2.04 |  avio 3.00 ms |&lt;/li&gt;
&lt;/ul&gt;



&lt;ul&gt;
	&lt;li&gt;The recipient&lt;br/&gt;
96 GB of RAM / hardware RAID 1 SATA disks.&lt;br/&gt;
I&apos;m moving all my data to this new server (I&apos;ll end up with a cluster with a single shard... but this server has twice the RAM of the previous 3 shards combined - 3x16=48GB vs 96GB).&lt;br/&gt;
On the recipient I&apos;d expect writes in correlation with the chunk data being migrated. This server was synced from its replica set about 1 week ago, so its data locality is very clean, with no holes in the files (it wasn&apos;t &quot;bootstrapped&quot;).&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;DSK |          sda  | busy    100% | read     152  | write   2664 |  MBr/s   0.36 | MBw/s  11.84 |  avio 3.55 ms |&lt;/p&gt;

&lt;p&gt;You can probably find more insights in my MMS account: &lt;a href=&quot;https://mms.mongodb.com/host/cluster/51a2dc5c7fe227e9f188c509/52bb9a10e4b0256ace50e0d3&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://mms.mongodb.com/host/cluster/51a2dc5c7fe227e9f188c509/52bb9a10e4b0256ace50e0d3&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Have a look at the attached log extract for a typical overview of chunk migration speed.&lt;/p&gt;</description>
                <environment></environment>
        <key id="149381">SERVER-14704</key>
            <summary>Chunk migration become (linearly?) slower as collection grows</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="ramon.fernandez@mongodb.com">Ramon Fernandez Marina</assignee>
                                    <reporter username="tubededentifrice">Vincent</reporter>
                        <labels>
                    </labels>
                <created>Sun, 27 Jul 2014 20:54:45 +0000</created>
                <updated>Wed, 10 Dec 2014 23:19:29 +0000</updated>
                            <resolved>Tue, 19 Aug 2014 23:06:59 +0000</resolved>
                                    <version>2.6.3</version>
                                                    <component>Sharding</component>
                                        <votes>1</votes>
                                    <watches>5</watches>
                    <comments>
                            <comment id="764496" author="tubededentifrice" created="Fri, 14 Nov 2014 16:43:18 +0000"  >&lt;p&gt;For reference, I solved this issue by switching to servers with less RAM (32GB instead of 96GB) but equipped with SSDs. It dramatically improved the performance of my application, and chunk migrations are now blazing fast, even though my data doesn&apos;t fit in RAM (not even my indexes).&lt;/p&gt;</comment>
                            <comment id="696051" author="ramon.fernandez" created="Tue, 19 Aug 2014 23:06:31 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=tubededentifrice&quot; class=&quot;user-hover&quot; rel=&quot;tubededentifrice&quot;&gt;tubededentifrice&lt;/a&gt;, thanks for sending your config database. There are several factors that can affect migrations, for example:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Network traffic&lt;/li&gt;
	&lt;li&gt;Amount of data in each chunk; note that a chunk is a logical range, and the actual data in each range may be very different from the chunk size&lt;/li&gt;
	&lt;li&gt;Misconfigured disks: data writes involve lots of random I/O, so an aggressive read-ahead configuration may harm performance&lt;/li&gt;
	&lt;li&gt;Choice of shard key: sharding on {&lt;tt&gt;_id:1&lt;/tt&gt;} causes all writes to go to the same shard, thus slowing down migrations involving that shard&lt;/li&gt;
	&lt;li&gt;Application I/O load on the sending and/or receiving end, even with large amounts of RAM&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;After examining the data you sent we haven&apos;t found any evidence of a bug in MongoDB. Since the SERVER project is for reporting bugs or feature suggestions for the MongoDB server and tools, I would recommend that you post your questions on the &lt;a href=&quot;http://groups.google.com/group/mongodb-user&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;mongodb-user group&lt;/a&gt;, where you can reach a wide audience of MongoDB experts.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="676016" author="tubededentifrice" created="Wed, 30 Jul 2014 05:33:06 +0000"  >&lt;p&gt;It looks like the writes on the receiving side are caused by application updates, not by the chunk migration; my apologies.&lt;br/&gt;
However, when the application is stopped and no writes are happening, the chunk migration isn&apos;t any faster (and the sender side still issues a lot of reads+writes that are, for sure, related to the chunk migration in progress).&lt;/p&gt;</comment>
                            <comment id="675847" author="tubededentifrice" created="Wed, 30 Jul 2014 01:20:55 +0000"  >&lt;p&gt;Note that I already had this problem when I switched from 1 to 5 and then 5 to 3 shards (the migration took &lt;b&gt;really&lt;/b&gt; forever, with about 1 chunk (64 MB at the time...) moved per 24 hours &amp;#8211; the hardware used was much less powerful at the time).&lt;br/&gt;
I ended up renting a few very big SSD cloud servers just to speed up the migrations!&lt;br/&gt;
But here, the hardware should not be the limiting factor on the receiving side: 96GB of RAM, about 60GB total index size for the entire dataset (including data not yet on that shard), hardware RAID with 512MB of cache with BBU, and it still performs way too many NON-SEQUENTIAL writes.&lt;/p&gt;</comment>
                            <comment id="675749" author="tubededentifrice" created="Tue, 29 Jul 2014 22:43:45 +0000"  >&lt;p&gt;Hi Ramon,&lt;br/&gt;
1. I did the dump; how can I send it to you privately? I don&apos;t want to disclose architecture/site data publicly. This would also allow me to share things with you without having to anonymize them first. Maybe I could send you a dump of the sender/receiver logs as well?&lt;/p&gt;

&lt;p&gt;2. Nope, initially it was a simple RS without sharding. Then I had 5 shards, then 3 and now I&apos;m moving all the data to have only 1 shard (maybe 2 shards then, etc.). Chunks are moved using moveChunk commands (because I can only have 1 draining shard, and it would be dumb to have the chunks moved to the other shard I want to remove!)&lt;/p&gt;

&lt;p&gt;3. I had some &quot;jumbo&quot; chunks in the past, which were able to move with 256MB. Besides this, the docs state(d?) that chunk migration is more efficient with big chunks and puts less stress on mongos (I had an issue with this too...) at the cost of a less evenly balanced cluster and more &quot;painful&quot; migrations, which I don&apos;t really care about. (Moving a 256MB chunk takes less time than moving 4x 64MB chunks, doesn&apos;t it?)&lt;/p&gt;

&lt;p&gt;4. I keep a &quot;tail -f&quot; on the logs =&amp;gt; &lt;tt&gt;shardKeyPattern: { _id: 1.0 }, state: &quot;clone&quot;, counts: { cloned: 24470, clonedBytes: 73203286, catchup: 0, steady: 0 }&lt;/tt&gt;; the numbers I wrote may not be 100% accurate but give a good idea of the reality.&lt;/p&gt;


&lt;p&gt;Edit: I&apos;ve attached the dump to this ticket, restricted to project users&lt;/p&gt;</comment>
                            <comment id="675711" author="ramon.fernandez" created="Tue, 29 Jul 2014 22:05:52 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=tubededentifrice&quot; class=&quot;user-hover&quot; rel=&quot;tubededentifrice&quot;&gt;tubededentifrice&lt;/a&gt;, we&apos;ll need more information to determine whether there&apos;s a bug here:&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;Can you send us a dump of your config metadata so we can investigate further? You can get it using &lt;tt&gt;mongodump&lt;/tt&gt; against a &lt;tt&gt;mongos&lt;/tt&gt; as follows:
&lt;pre&gt;mongodump -d config --host &amp;lt;mongos host:port&amp;gt;&lt;/pre&gt;
&lt;p&gt;This metadata contains the chunk migration history, so we can correlate migration speed with the information in MMS.&lt;/p&gt;&lt;/li&gt;
	&lt;li&gt;Did you pre-split your chunks before inserting data?&lt;/li&gt;
	&lt;li&gt;Can you elaborate on the reason for choosing your chunksize?&lt;/li&gt;
	&lt;li&gt;How do you compute the amount of data in one chunk?&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="673018" author="tubededentifrice" created="Sun, 27 Jul 2014 23:56:32 +0000"  >&lt;p&gt;I forgot to mention: the 3rd shard (the one that is not yet involved in chunk migration but still holds ~30% of the &quot;big&quot; collection I&apos;m moving) is idle:&lt;br/&gt;
DSK |          sda |               | busy      8% |  read       0 |              |  write    231 | KiB/r      0 |               | KiB/w      5 | MBr/s   0.00  |              | MBw/s   0.13  | avq    54.11 |               | avio 3.43 ms &lt;/p&gt;

&lt;p&gt;which makes me conclude it&apos;s not a matter of what other operations are being performed in the database, but only a matter of the chunks being migrated.&lt;br/&gt;
(This shard behaves exactly like the others when it sends a chunk.)&lt;/p&gt;


&lt;p&gt;And also: all filesystems are ext4&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="48640" name="log extract.log" size="12493" author="tubededentifrice" created="Sun, 27 Jul 2014 20:54:45 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 29 Jul 2014 22:05:52 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        9 years, 13 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            9 years, 13 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>tubededentifrice</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlqx3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hs0v9r:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>129525</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlm8n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>