<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:21:52 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-29810] Mongos no more refreshing chunks and trying impossible splits</title>
                <link>https://jira.mongodb.org/browse/SERVER-29810</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;Our mongos instances regularly stop refreshing chunk metadata from the config servers for some collections. When such a mongos then tries to split a chunk, the split fails with &quot;IncompatibleShardingMetadata: Unable to find chunk with the exact bounds&quot; if the chunk has already been split by another mongos.&lt;/p&gt;

&lt;p&gt;Our MongoDB cluster details:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Many shards plus a config server replica set, each a 3-member replica set (1 primary + 2 secondaries)&lt;/li&gt;
	&lt;li&gt;2 mongos&lt;/li&gt;
	&lt;li&gt;Balancer is disabled&lt;/li&gt;
	&lt;li&gt;Package version 3.4.4, OS: Debian 8 Jessie&lt;/li&gt;
	&lt;li&gt;Servers: 6-core Xeon CPU, 64 GB RAM, ~3 TB SSD, ext4 file system&lt;/li&gt;
	&lt;li&gt;~40 collections in 1 database&lt;/li&gt;
	&lt;li&gt;Many writes and reads&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Typical scenario (shard, collection, and field names and values have been replaced):&lt;/p&gt;

&lt;p&gt;From the mongos A logs:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-21T14:46:42.087+0200 I SHARDING [conn6] Refreshing chunks for collection stats.collectionName based on version 9743|18393||5320f5e96789f4d11460c4a0&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-21T14:46:42.129+0200 I SHARDING [CatalogCacheLoader-1] Refresh for collection stats.collectionName took 42 ms and found version 9743|18393||5320f5e96789f4d11460c4a0&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;From the mongos B logs:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-21T14:51:05.844+0200 I SHARDING [conn3103094] autosplitted stats.collectionName chunk: shard: shardName, lastmod: 9743|18367||5320f5e96789f4d11460c4a0, [{ _id: { d: 20170621, a: 78, c: 909090, d: 12345678 } }, { _id: { d: 20170621, a: 4444, b: 111111111, c: 222222, d: 3333333 } }) into 3 parts (splitThreshold 67108864) (migrate suggested, but no migrations allowed)&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;From the mongos A logs:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-21T14:55:21.233+0200 I SHARDING [conn379] Split chunk { splitChunk: &quot;stats.collectionName&quot;, configdb: &quot;csReplSet/172.16.18.28:27025,172.16.18.3:27025,172.16.18.30:27025&quot;, from: &quot;shardName&quot;, keyPattern: { _id: 1.0 }, shardVersion: [ Timestamp 9743000|149465, ObjectId(&apos;5320f5e96789f4d11460c4a0&apos;) ], min: { _id: { d: 20170621, a: 4444, b: 111111111, c: 222222, d: 3333333 } }, max: { _id: { d: 20170621, a: 4444, b: 121212121, c: 343434, d: 5656565 } }, splitKeys: [ { _id: { d: 20170621, a: 4444, b: 555555555, c: 666666, d: 7777777 } }, { _id: { d: 20170621, a: 4444, b: 888888888, c: 999, d: 000000 } } ] } failed :: caused by :: IncompatibleShardingMetadata: Unable to find chunk with the exact bounds [{ _id: { d: 20170621, a: 4444, b: 111111111, c: 222222, d: 3333333 } }, { _id: { d: 20170621, a: 4444, b: 121212121, c: 343434, d: 5656565 } }) at collection version 9743|18399||5320f5e96789f4d11460c4a0&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;We can see that between the chunk refresh and the split attempt on mongos A, mongos B had already split that chunk, so the split attempt failed.&lt;/p&gt;
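
&lt;p&gt;As a rough illustration, the staleness can be read directly from the version strings in the excerpts above. The Python sketch below assumes they follow the major|minor||epoch layout; the helper names parse_chunk_version and is_stale are hypothetical and are not part of MongoDB:&lt;/p&gt;

```python
def parse_chunk_version(text):
    """Split a 'major|minor||epoch' string into (major, minor, epoch)."""
    numbers, epoch = text.split("||")
    major, minor = numbers.split("|")
    return int(major), int(minor), epoch


def is_stale(cached, current):
    """Return True when the cached version is older than the current one."""
    c_major, c_minor, c_epoch = parse_chunk_version(cached)
    n_major, n_minor, n_epoch = parse_chunk_version(current)
    if c_epoch != n_epoch:
        # Assumption: versions with different epochs are not comparable,
        # so this sketch simply treats the cached one as stale.
        return True
    return (n_major, n_minor) > (c_major, c_minor)


# Version found by the mongos A refresh at 14:46:42, and the collection
# version reported with the failed split at 14:55:21 (copied from the logs).
cached = "9743|18393||5320f5e96789f4d11460c4a0"
current = "9743|18399||5320f5e96789f4d11460c4a0"
print(is_stale(cached, current))  # prints True
```

&lt;p&gt;With the values from our logs, the version mongos A last found is 6 minor versions behind the one reported with the failed split.&lt;/p&gt;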

&lt;p&gt;The problem is that a mongos sometimes suddenly stops refreshing a collection and does not resume until we restart it or force a refresh, which can be a long time. In those cases, after a few days the mongos attempts larger and larger splits:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-20T11:39:15.052+0200 I SHARDING [conn2766148] warning: log line attempted (53kB) over max size (10kB), printing beginning and end ... Split chunk { splitChunk: &quot;stats.collectionName&quot;, configdb: &quot;csReplSet/172.16.18.28:27025,172.16.18.3:27025,172.16.18.30:27025&quot;, from: &quot;shardName&quot;, keyPattern: { _id: 1.0 }, shardVersion: [ Timestamp 3000|18274, ObjectId(&apos;5667717d46b7ddcd61ef5459&apos;) ], min: { _id: { d: 20170611, a: 111111, b: 2222222, c: 333, d: 333 } }, max: { _id: MaxKey }, splitKeys: [ ...... VERY LONG KEYS LIST ...... ] } failed :: caused by :: IncompatibleShardingMetadata: Unable to find chunk with the exact bounds [{ _id: { d: 20170611, a: 111111, b: 2222222, c: 333, d: 333 } }, { _id: MaxKey }) at collection version 3|19540||5667717d46b7ddcd61ef5459&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;&quot;_id.d&quot; is the insert date. Here it is 20170611, but the log entry is dated 2017-06-20: a difference of 9 days, that is, 9 days of failed split attempts. During this period we found no chunk refresh in the logs for the affected collection. These large split attempts slow our shards down considerably (long splitVector queries on the primary members), which is very troublesome for us.&lt;/p&gt;
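
&lt;p&gt;The 9-day gap can be checked with a small Python sketch. It assumes, as in our data, that _id.d is a YYYYMMDD integer and that the log timestamp starts with an ISO date; days_behind is a hypothetical helper name:&lt;/p&gt;

```python
from datetime import datetime


def days_behind(id_d, log_timestamp):
    """Days between the insert date in _id.d and the date of a log line."""
    inserted = datetime.strptime(str(id_d), "%Y%m%d").date()
    # Only the leading YYYY-MM-DD part of the log timestamp is used.
    logged = datetime.strptime(log_timestamp[:10], "%Y-%m-%d").date()
    return (logged - inserted).days


# Lower bound of the failed split and the timestamp of its log entry.
print(days_behind(20170611, "2017-06-20T11:39:15.052+0200"))  # prints 9
```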

&lt;p&gt;So we have to regularly run db.adminCommand(&quot;flushRouterConfig&quot;) on the mongos to force a refresh.&lt;/p&gt;
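
&lt;p&gt;A minimal sketch of how this periodic flush could be scripted in Python. It assumes a client object that exposes admin.command(...), as a PyMongo MongoClient connected to the mongos does; the function names, the interval and the number of rounds are hypothetical, and this is not our exact production script:&lt;/p&gt;

```python
import time


def flush_router_config(client):
    """Run flushRouterConfig against the mongos behind `client`."""
    return client.admin.command("flushRouterConfig")


def flush_periodically(client, interval_seconds, rounds, sleep=time.sleep):
    """Flush the routing cache `rounds` times, pausing between flushes.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    results = []
    for i in range(rounds):
        results.append(flush_router_config(client))
        if i + 1 != rounds:
            # No pause is needed after the final flush.
            sleep(interval_seconds)
    return results
```

&lt;p&gt;With PyMongo this could be called as flush_periodically(MongoClient(&quot;mongodb://mongos-a.example:27017&quot;), 3600, 24), where the host name is a placeholder. It has to be run against every mongos, since each one keeps its own routing cache.&lt;/p&gt;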

&lt;p&gt;Thank you in advance for your help.&lt;/p&gt;

&lt;p&gt;Best regards,&lt;br/&gt;
Slawomir&lt;/p&gt;</description>
                <environment></environment>
        <key id="397411">SERVER-29810</key>
            <summary>Mongos no more refreshing chunks and trying impossible splits</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="esha.maharishi@mongodb.com">Esha Maharishi</assignee>
                                    <reporter username="slluk-sa">Slawomir Lukiewski</reporter>
                        <labels>
                    </labels>
                <created>Fri, 23 Jun 2017 11:00:20 +0000</created>
                <updated>Sat, 29 Jul 2017 16:23:01 +0000</updated>
                            <resolved>Fri, 23 Jun 2017 15:59:58 +0000</resolved>
                                    <version>3.4.4</version>
                                                    <component>Sharding</component>
                                        <votes>0</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="1605302" author="kaloian.manassiev" created="Fri, 23 Jun 2017 15:59:48 +0000"  >&lt;p&gt;We have confirmed that this is indeed the same problem as &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-28418&quot; title=&quot;make the split command on mongod return a stale version error if the requested chunk bounds are not found&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-28418&quot;&gt;&lt;del&gt;SERVER-28418&lt;/del&gt;&lt;/a&gt; so I am closing it as duplicate.&lt;/p&gt;

&lt;p&gt;Please follow &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-28418&quot; title=&quot;make the split command on mongod return a stale version error if the requested chunk bounds are not found&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-28418&quot;&gt;&lt;del&gt;SERVER-28418&lt;/del&gt;&lt;/a&gt; for more information on when it gets released.&lt;/p&gt;

&lt;p&gt;Best regards,&lt;br/&gt;
-Kal.&lt;/p&gt;</comment>
                            <comment id="1605239" author="slluk-sa" created="Fri, 23 Jun 2017 15:02:54 +0000"  >&lt;p&gt;Hi Kaloian,&lt;/p&gt;

&lt;p&gt;Thanks for your answer!&lt;br/&gt;
Yes, this problem probably started only after we upgraded to 3.4.4. We stayed on 3.4.3 for only a few weeks (before that we were on 3.2.8), so I am not 100% sure, but the chunk refresh optimization you mention seems to fit our case. On 3.2.8 we also saw some IncompatibleShardingMetadata errors, but they weren&apos;t a real problem because refreshes were more frequent.&lt;/p&gt;

&lt;p&gt;Yes, &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-28418&quot; title=&quot;make the split command on mongod return a stale version error if the requested chunk bounds are not found&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-28418&quot;&gt;&lt;del&gt;SERVER-28418&lt;/del&gt;&lt;/a&gt; should be a very good fix!&lt;br/&gt;
I look forward to the 3.4.6 release!&lt;/p&gt;

&lt;p&gt;Best regards,&lt;br/&gt;
Slawomir&lt;/p&gt;</comment>
                            <comment id="1605194" author="kaloian.manassiev" created="Fri, 23 Jun 2017 14:22:37 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=slluk-sa&quot; class=&quot;user-hover&quot; rel=&quot;slluk-sa&quot;&gt;slluk-sa&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for reporting this issue, and sorry for the inconvenience of having to manually refresh the routing cache.&lt;/p&gt;

&lt;p&gt;Please correct me if I am wrong, but this problem should have started happening only after you upgraded to 3.4.4 - is that correct? In that version we optimized chunk refreshes which were happening too frequently and this is one of the use cases which got regressed. I believe this is a duplicate of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-28418&quot; title=&quot;make the split command on mongod return a stale version error if the requested chunk bounds are not found&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-28418&quot;&gt;&lt;del&gt;SERVER-28418&lt;/del&gt;&lt;/a&gt; which we have fixed in the latest master branch and it is waiting to be backported to version 3.4.6.&lt;/p&gt;

&lt;p&gt;We&apos;ll look into it and report the details here.&lt;/p&gt;

&lt;p&gt;Thanks again for your report.&lt;/p&gt;

&lt;p&gt;Best regards,&lt;br/&gt;
-Kal.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="366748">SERVER-28418</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 23 Jun 2017 14:22:37 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 33 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>backlog-server-pm</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 33 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>esha.maharishi@mongodb.com</customfieldvalue>
            <customfieldvalue>kaloian.manassiev@mongodb.com</customfieldvalue>
            <customfieldvalue>slluk-sa</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht9s9z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|ht1rd3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht9ecf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>