<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:18:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-28422] Cluster stuck because replication heartbeat does not detect hanging members</title>
                <link>https://jira.mongodb.org/browse/SERVER-28422</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We&apos;ve hit a bug that has made our entire MongoDB cluster (15 bare-metal replica sets, each with 2 members plus an arbiter) unresponsive several times.&lt;/p&gt;

&lt;p&gt;Whenever an issue makes the mongod process hang, the cluster gets stuck too. The replication heartbeat should detect this condition and trigger a primary switch.&lt;/p&gt;

&lt;p&gt;In our case, IO issues left the mongod process blocked waiting for IO and unresponsive: no queries could be performed against that member because they hung in IO wait.&lt;br/&gt;
When that happens, the faulty member is not removed from the replica set and no primary switch takes place, so the faulty member remains the primary and the whole replica set becomes unresponsive.&lt;br/&gt;
Worse, if one replica set is stuck in this way, the whole cluster is stuck, and all queries made through mongos hang.&lt;/p&gt;

&lt;p&gt;According to the documentation, the heartbeat performs a ping. I&apos;m not sure what kind of ping, but it is not enough to detect a bad member: if the problem is IO (one of the main problems in databases), the ping and even a TCP connection still work.&lt;br/&gt;
The heartbeat should do something more sophisticated, such as performing a query that reads from disk (importantly, not from cached memory).&lt;/p&gt;
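
&lt;p&gt;As a rough illustration of the kind of check meant here, the sketch below touches the disk under a deadline and reports a stall if the operation does not finish in time. It is not how the MongoDB heartbeat works. The function name, the probe file prefix and the timeout are hypothetical, and it uses a small write plus fsync as the disk-touching operation instead of an uncached read.&lt;/p&gt;

```python
import os
import tempfile
import threading


def disk_is_responsive(directory, timeout_s=5.0):
    """Return True if a small write+fsync in `directory` completes within timeout_s.

    Hypothetical helper for illustration only; not part of MongoDB.
    """
    done = threading.Event()
    outcome = {"ok": False}

    def touch_disk():
        try:
            fd, path = tempfile.mkstemp(dir=directory, prefix="probe-")
            try:
                os.write(fd, b"probe")
                os.fsync(fd)  # ask the OS to push the data to the device
            finally:
                os.close(fd)
                os.unlink(path)
            outcome["ok"] = True
        except OSError:
            outcome["ok"] = False
        finally:
            done.set()

    # Daemon thread: if the IO never returns, it must not keep the process alive.
    threading.Thread(target=touch_disk, daemon=True).start()
    finished = done.wait(timeout_s)  # False when the deadline expires first
    return finished and outcome["ok"]
```

&lt;p&gt;A member whose data directory sits on a stalled device would fail this check even though a ping or a TCP connection to it still succeeds.&lt;/p&gt;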

&lt;p&gt;Also, when a member is completely stuck and unresponsive, it is probably worth removing it from the replica set rather than just transitioning it to secondary, because all the replication threads between the primary and the secondary will hang.&lt;/p&gt;</description>
                <environment></environment>
        <key id="366796">SERVER-28422</key>
            <summary>Cluster stuck because replication heartbeat does not detect hanging members</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="ramon.fernandez@mongodb.com">Ramon Fernandez Marina</assignee>
                                    <reporter username="victorgp">VictorGP</reporter>
                        <labels>
                    </labels>
                <created>Tue, 21 Mar 2017 23:51:47 +0000</created>
                <updated>Wed, 31 May 2017 21:23:34 +0000</updated>
                            <resolved>Wed, 22 Mar 2017 01:05:50 +0000</resolved>
                                    <version>3.2.8</version>
                    <version>3.4.2</version>
                                                    <component>Replication</component>
                    <component>Stability</component>
                                        <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="1529960" author="victorgp" created="Wed, 22 Mar 2017 00:21:55 +0000"  >&lt;p&gt;Yes, it looks like the same issue.&lt;/p&gt;

&lt;p&gt;I will comment there.&lt;/p&gt;</comment>
                            <comment id="1529956" author="ramon.fernandez" created="Wed, 22 Mar 2017 00:08:31 +0000"  >&lt;p&gt;I believe the root cause is the same one as described in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-14139&quot; title=&quot;Disk failure on one node can (eventually) block a whole cluster&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-14139&quot;&gt;&lt;del&gt;SERVER-14139&lt;/del&gt;&lt;/a&gt;. Unfortunately that ticket hasn&apos;t been resolved because the solution is unclear &amp;#8211; please see &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-14139?focusedCommentId=755250&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-755250&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;this comment&lt;/a&gt; for more information.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="139677">SERVER-14139</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 22 Mar 2017 00:03:27 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 47 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>backlog-server-pm</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 47 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>victorgp</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht4l3j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsx0yn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;p&gt;I couldn&apos;t reproduce the same IO issue we experienced because the bare-metal setup is complex and I would probably need the same hardware, disks, RAID controller, etc.&lt;/p&gt;

&lt;p&gt;But I managed to reproduce the exact same symptoms using NFS:&lt;/p&gt;

&lt;p&gt;(setup for Ubuntu)&lt;/p&gt;

&lt;p&gt;1- Set up a simple NFS server exporting an empty directory: &lt;a href=&quot;https://help.ubuntu.com/community/SettingUpNFSHowTo&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://help.ubuntu.com/community/SettingUpNFSHowTo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2- Install nfs-common and mount an NFS directory:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;mount -t nfs -o noatime,bg,nolock,proto=tcp,port=&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2049&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; &amp;lt;nfs-server-ip&amp;gt;:/ /mongodata1&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;3- Create a replica set of 2 members and an arbiter. One of the members, the PRIMARY, will have its storage.dbPath pointing to the NFS directory: /mongodata1&lt;/p&gt;

&lt;p&gt;4- Write data to the replica set and verify that everything works as expected (test primary switches, etc.).&lt;/p&gt;

&lt;p&gt;5- While the member whose dbPath points to the NFS directory is the PRIMARY, stop the NFS daemon on the NFS server:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;service nfs-kernel-server stop&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;6- Keep writing to the replica set. You will be able to write for some time (probably because it is still using the file system cache), but a &apos;show dbs&apos; hangs right away, and after some seconds the writes hang too.&lt;br/&gt;
The hanging member (PRIMARY) is never transitioned to SECONDARY, so the whole replica set becomes unresponsive.&lt;br/&gt;
The hanging mongod process cannot be stopped with SIGTERM; you will need SIGKILL.&lt;/p&gt;
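
&lt;p&gt;For reference, the topology from step 3 (two data-bearing members plus an arbiter) could be described with a replica set configuration document along these lines. This is only a sketch: the helper name, the set name and the host names are hypothetical, and the resulting document would still have to be passed to rs.initiate() in the shell or to the replSetInitiate admin command.&lt;/p&gt;

```python
def build_repl_config(set_name, data_hosts, arbiter_host):
    """Build a replica set config document: data members plus one arbiter.

    Hypothetical helper for illustration; it only assembles the document.
    """
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    members.append(
        {"_id": len(data_hosts), "host": arbiter_host, "arbiterOnly": True}
    )
    return {"_id": set_name, "members": members}


# Hypothetical host names. The first member is assumed to be the one whose
# storage.dbPath points to the NFS mount (/mongodata1).
config = build_repl_config(
    "rs0",
    ["mongo-nfs.example:27017", "mongo-local.example:27017"],
    "mongo-arb.example:27017",
)
```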




&lt;p&gt;This can be extended to a sharded cluster. Create another replica set, create the shards using mongos, shard a collection, and write data to that collection through mongos.&lt;br/&gt;
Then, as before, stop the NFS daemon on the NFS server.&lt;br/&gt;
You will see the same issue from mongos: after a few seconds the queries hang, or, if you don&apos;t want to wait, just run &apos;show dbs&apos;.&lt;/p&gt;

&lt;p&gt;At this point, the whole cluster is unresponsive.&lt;/p&gt;

&lt;p&gt;Another important thing to note: the IO locking issue we had once occurred on a secondary member. This also left the whole replica set stuck, and therefore the whole cluster too. I couldn&apos;t reproduce this with NFS.&lt;/p&gt;


&lt;p&gt;I&apos;ve reproduced this in 3.2.8 and in 3.4.2.&lt;/p&gt;</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlge7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>