<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:57:45 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-41451] Re-starting a secondary database in a replica set generates NetworkInterfaceExceededTimeLimit errors</title>
                <link>https://jira.mongodb.org/browse/SERVER-41451</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We have a replica set and the secondary and primary are in sync and all is good.&lt;/p&gt;

&lt;p&gt;After 7pm each evening we stop the secondary database instance, backup the server and then start it again. The secondary database instance then eventually catches up. This has worked for the last few months without any issues.&lt;/p&gt;

&lt;p&gt;Now, when the secondary database instance is restarted we get network timeout messages over and over again and the secondary gets further and further behind.&lt;/p&gt;

&lt;p&gt;We have no idea what causes these timeouts or how the timeout can be increased.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;^mongo log 28-05-2019.txt&amp;#93;&lt;/span&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="787067">SERVER-41451</key>
            <summary>Re-starting a secondary database in a replica set generates NetworkInterfaceExceededTimeLimit errors</summary>
                <type id="6" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14720&amp;avatarType=issuetype">Question</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13204">Community Answered</resolution>
                                        <assignee username="dmitry.agranat@mongodb.com">Dmitry Agranat</assignee>
                                    <reporter username="ihannah@meniscus.co.uk">Ian Hannah</reporter>
                        <labels>
                    </labels>
                <created>Sun, 2 Jun 2019 19:08:13 +0000</created>
                <updated>Fri, 27 Oct 2023 15:56:39 +0000</updated>
                            <resolved>Mon, 24 Feb 2020 09:59:46 +0000</resolved>
                                    <version>3.6.8</version>
                                                    <component>Replication</component>
                                        <votes>0</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="2986547" author="ihannah@meniscus.co.uk" created="Thu, 19 Mar 2020 09:12:22 +0000"  >&lt;p&gt;@here We have done some further extensive tests and we ONLY get the network errors once Mongo is lagging behind so this is clearly a Mongo issue. Once it gets behind then it cannot keep up.&lt;/p&gt;</comment>
                            <comment id="2970152" author="ihannah@meniscus.co.uk" created="Thu, 12 Mar 2020 09:17:56 +0000"  >&lt;p&gt;How do you know that these network errors are not caused by Mongo? How can I diagnose what Mongo is trying to do? What is the current timeout and can this be increased?&lt;/p&gt;

&lt;p&gt;You say that this may indicate inefficient replication settings. A small oplog window would not cause network errors, would it?&lt;/p&gt;

&lt;p&gt;You made a suggestion with the read concern that you thought would fix it so what was the reason behind this suggestion?&lt;/p&gt;</comment>
                            <comment id="2965727" author="dmitry.agranat" created="Wed, 11 Mar 2020 14:01:10 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Based on your last comment:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When we restart the server (after backup) most of the time the logs show these network errors. Occasionally when we restart the server we do not get these network errors and the secondary works correctly and catches up. When we get the network errors it does not catch up&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;These inconsistent and sporadic errors might indicate low-level network issues. Alternatively, they might indicate inefficient replication settings, for example, an oplog window that is too small because a lot of oplog data is generated per hour.&lt;/p&gt;

&lt;p&gt;The SERVER project is for bugs and feature suggestions for the MongoDB server. As this ticket does not appear to be a bug, I will now close it. If you need further assistance troubleshooting, I encourage you to ask our community by posting on the &lt;a href=&quot;http://community.mongodb.com&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;MongoDB Community Forums&lt;/a&gt; or on &lt;a href=&quot;https://stackoverflow.com/questions/tagged/mongodb&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Stack Overflow with the &lt;tt&gt;mongodb&lt;/tt&gt; tag&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Dima&lt;/p&gt;</comment>
                            <comment id="2954505" author="ihannah@meniscus.co.uk" created="Mon, 9 Mar 2020 16:20:14 +0000"  >&lt;p&gt;@here I am still waiting for some response to my last message.&#160;enableMajorityReadConcern is already false and we get these errors intermittently it seems.&lt;/p&gt;</comment>
                            <comment id="2918503" author="ihannah@meniscus.co.uk" created="Fri, 28 Feb 2020 21:09:44 +0000"  >&lt;p&gt;Some further information that may help. When we restart the server (after backup) most of the time the logs show these network errors. Occasionally when we restart the server we do not get these network errors and the secondary works correctly and catches up. When we get the network errors it does not catch up.&lt;/p&gt;

&lt;p&gt;Is there anything special that needs to be done when starting the server to prevent these network errors?&lt;/p&gt;</comment>
                            <comment id="2913883" author="ihannah@meniscus.co.uk" created="Thu, 27 Feb 2020 10:50:09 +0000"  >&lt;p&gt;As a result of these errors the secondary is getting further and further behind and cannot catch up. Then the oplog is too stale.&lt;/p&gt;</comment>
                            <comment id="2913868" author="ihannah@meniscus.co.uk" created="Thu, 27 Feb 2020 10:26:32 +0000"  >&lt;p&gt;We are constantly getting this error in the log:&lt;/p&gt;

&lt;p&gt;2020-02-27T09:27:14.631+0000 I ASIO &lt;span class=&quot;error&quot;&gt;&amp;#91;NetworkInterfaceASIO-RS-0&amp;#93;&lt;/span&gt; Ending connection to host 192.168.45.241:27057 due to bad connection status; 1 connections to that host remain open&lt;br/&gt;
2020-02-27T09:27:14.631+0000 I REPL &lt;span class=&quot;error&quot;&gt;&amp;#91;replication-997&amp;#93;&lt;/span&gt; Error returned from oplog query (no more query restarts left): NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out&lt;br/&gt;
2020-02-27T09:27:14.635+0000 W REPL &lt;span class=&quot;error&quot;&gt;&amp;#91;rsBackgroundSync&amp;#93;&lt;/span&gt; Fetcher stopped querying remote oplog with error: NetworkInterfaceExceededTimeLimit: error in fetcher batch callback: Operation timed out&lt;br/&gt;
2020-02-27T09:27:14.636+0000 I REPL &lt;span class=&quot;error&quot;&gt;&amp;#91;rsBackgroundSync&amp;#93;&lt;/span&gt; Clearing sync source 192.168.45.241:27057 to choose a new one.&lt;br/&gt;
2020-02-27T09:27:14.638+0000 I REPL &lt;span class=&quot;error&quot;&gt;&amp;#91;rsBackgroundSync&amp;#93;&lt;/span&gt; sync source candidate: 192.168.45.241:27057&lt;br/&gt;
2020-02-27T09:27:15.773+0000 I REPL &lt;span class=&quot;error&quot;&gt;&amp;#91;rsBackgroundSync&amp;#93;&lt;/span&gt; Chose same sync source candidate as last time, 192.168.45.241:27057. Sleeping for 1 second to avoid immediately choosing a new sync source for the same reason as last time.&lt;br/&gt;
2020-02-27T09:27:16.776+0000 I ASIO &lt;span class=&quot;error&quot;&gt;&amp;#91;NetworkInterfaceASIO-RS-0&amp;#93;&lt;/span&gt; Connecting to 192.168.45.241:27057&lt;br/&gt;
2020-02-27T09:27:16.794+0000 I ASIO &lt;span class=&quot;error&quot;&gt;&amp;#91;NetworkInterfaceASIO-RS-0&amp;#93;&lt;/span&gt; Successfully connected to 192.168.45.241:27057, took 18ms (2 connections now open to 192.168.45.241:27057)&lt;/p&gt;</comment>
                            <comment id="2913854" author="ihannah@meniscus.co.uk" created="Thu, 27 Feb 2020 10:14:06 +0000"  >&lt;p&gt;We cannot have a PSS architecture - we do not have the capacity so we have to have a PSA architecture.&lt;/p&gt;

&lt;p&gt;This is the configuration file from one of the servers:&lt;/p&gt;

&lt;p&gt;systemLog:&lt;br/&gt;
&#160; &#160; destination: file&lt;br/&gt;
&#160; &#160; path: D:\MongoData\Log\mongod.log&lt;br/&gt;
storage:&lt;br/&gt;
&#160; &#160; engine: wiredTiger&lt;br/&gt;
&#160; &#160; wiredTiger:&lt;br/&gt;
&#160; &#160; &#160; &#160; engineConfig:&lt;br/&gt;
&#160; &#160; &#160; &#160; &#160; &#160; &#160;cacheSizeGB: 10&lt;br/&gt;
&#160; &#160; dbPath: D:\MongoData\DB&lt;br/&gt;
replication:&lt;br/&gt;
&#160; &#160; oplogSizeMB: 50000&lt;br/&gt;
&#160; &#160; replSetName: MAP&lt;br/&gt;
&#160; &#160; enableMajorityReadConcern: false&lt;br/&gt;
net:&lt;br/&gt;
&#160; &#160; bindIp: 127.0.0.1,192.168.45.239&lt;br/&gt;
&#160; &#160; port: 27057&lt;br/&gt;
security:&lt;br/&gt;
&#160; &#160; keyFile: C:\Program Files\MongoDB\Server\3.6\bin\keyfile&lt;br/&gt;
&#160; &#160; authorization: enabled&lt;/p&gt;

&lt;p&gt;So we already have&#160;enableMajorityReadConcern set to false and we still have the same issue.&lt;/p&gt;

&lt;p&gt;Can you please advise on what we should do?&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;</comment>
                            <comment id="2903892" author="dmitry.agranat" created="Mon, 24 Feb 2020 09:57:28 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;, &lt;/p&gt;

&lt;p&gt;We do not recommend a PSA architecture under the default read concern majority configuration. You can either:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Replace the Arbiter with a Secondary member (PSA --&amp;gt; PSS) or&lt;/li&gt;
	&lt;li&gt;Disable read concern &quot;&lt;a href=&quot;https://docs.mongodb.com/manual/reference/read-concern-majority/index.html#readconcern.%22majority%22&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;majority&lt;/a&gt;&quot;. However, this has implications for change streams (in MongoDB 4.0 and earlier only) and transactions on sharded clusters. For more information, see &lt;a href=&quot;https://docs.mongodb.com/manual/reference/read-concern-majority/index.html#disable-read-concern-majority&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Disable Read Concern Majority&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I will go ahead and close this ticket, but if you are still experiencing issues after implementing either of the above recommendations, please reopen it and upload the following data via the provided secure uploader:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;archive of &lt;tt&gt;diagnostic.data&lt;/tt&gt; from all members of the replica set&lt;/li&gt;
	&lt;li&gt;compressed mongod logs from all members of the replica set&lt;/li&gt;
	&lt;li&gt;The exact time and timezone of the issue&lt;/li&gt;
	&lt;li&gt;The output of &lt;a href=&quot;https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#dbcmd.replSetGetStatus&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;replSetGetStatus&lt;/a&gt;, for example:
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;db.adminCommand( { replSetGetStatus : 1 } )&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="2886677" author="ihannah@meniscus.co.uk" created="Thu, 20 Feb 2020 15:35:47 +0000"  >&lt;p&gt;When I run this:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#rs-status-output&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;replSetGetStatus&lt;/a&gt;&#160;&lt;/p&gt;

&lt;p&gt;I get &quot;method is not defined&quot;. Is that the correct command?&lt;/p&gt;

&lt;p&gt;We have a primary, secondary and arbiter in our configuration.&lt;/p&gt;</comment>
                            <comment id="2871611" author="dmitry.agranat" created="Wed, 12 Feb 2020 09:30:18 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;, &lt;/p&gt;

&lt;p&gt;I had a look at the provided data. There are a few issues with the configuration of this deployment, as well as an unclear member state during the reported event. For example, I can see this in the log:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2020-02-04T01:09:13.049+0000 I REPL     [replexec-1] Member [&amp;lt;Real IP was redacted here]1 is now in state PRIMARY&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2020-02-04T01:09:13.050+0000 I REPL     [replexec-0] Member [&amp;lt;Real IP was redacted here]0 is now in state ARBITER&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;During the time you stopped your secondary to perform a backup, the other member of the replica set was also down (so 2 out of 3 members are down).&lt;/p&gt;

&lt;p&gt;Could you please clarify your replica set deployment and configuration by providing the output of &lt;a href=&quot;https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#rs-status-output&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;replSetGetStatus&lt;/a&gt; command?&lt;/p&gt;</comment>
                            <comment id="2803816" author="ihannah@meniscus.co.uk" created="Thu, 6 Feb 2020 08:14:30 +0000"  >&lt;p&gt;@Daniel Hatcher. The logs are uploaded. The secondary did eventually catch up but it took many hours. I believe that the network errors shown in the logs are causing the lag issues as this is what we saw before. We will perform the process again while you are looking into it.&lt;/p&gt;</comment>
                            <comment id="2782689" author="daniel.hatcher" created="Tue, 4 Feb 2020 16:00:56 +0000"  >&lt;p&gt;JIRA can sometimes be wonky with the @ system so no worries.&lt;/p&gt;

&lt;p&gt;Can you upload all the files to our &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/8faa6328-de9b-43db-a343-f5587c73f6ba.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Secure Uploader&lt;/a&gt;? Only MongoDB engineers will be able to access the contents.&lt;/p&gt;</comment>
                            <comment id="2782113" author="ihannah@meniscus.co.uk" created="Tue, 4 Feb 2020 10:03:47 +0000"  >&lt;p&gt;Daniel Hatcher - I cannot work out how to insert your name in a comment. @ does not seem to work!&lt;/p&gt;</comment>
                            <comment id="2782104" author="ihannah@meniscus.co.uk" created="Tue, 4 Feb 2020 10:00:46 +0000"  >&lt;p&gt;&lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;mailto:daniel.hatcher@mongodb.com&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;daniel.hatcher@mongodb.com&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.mongodb.org/images/icons/mail_small.gif&quot; height=&quot;12&quot; width=&quot;13&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt;&#160;This has now happened again. We have a replica set using version 3.6.13 of Mongo and last night we backed up the secondary and it was down for 2 hours. Now we are getting network errors in the logs and the secondary is not keeping up.&#160;&lt;/p&gt;

&lt;p&gt;I have the logs and diagnostic data from both servers but the zipped file for the Primary is quite large. Let me know where you want me to put the log files.&lt;/p&gt;</comment>
                            <comment id="2471614" author="daniel.hatcher" created="Tue, 8 Oct 2019 14:54:41 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;, because we&apos;re not actively investigating this bug, I&apos;m going to close it. However, please just leave a follow up comment if you get a chance to test a new version with replication and we can easily re-open the ticket and continue investigating.&lt;/p&gt;</comment>
                            <comment id="2471143" author="ihannah@meniscus.co.uk" created="Tue, 8 Oct 2019 08:47:09 +0000"  >&lt;p&gt;Hi Daniel,&lt;/p&gt;

&lt;p&gt;We are using version 3.6.13 but we have not set up replication again because we are currently looking into more pressing issues &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.mongodb.org/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;Can you please keep this ticket open? I am hoping that we can set up replication again shortly.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2444438" author="daniel.hatcher" created="Tue, 1 Oct 2019 22:10:05 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;, have you been able to check a later version of 3.6?&lt;/p&gt;</comment>
                            <comment id="2408389" author="ihannah@meniscus.co.uk" created="Thu, 5 Sep 2019 07:58:07 +0000"  >&lt;p&gt;We have tried 3.6.13 in isolation but not in a replica set - we have had other issues to resolve.&lt;/p&gt;

&lt;p&gt;I am hoping that we can try this over the next week or two and then I will get back to you.&lt;/p&gt;

&lt;p&gt;Please keep this ticket open for the time being.&lt;/p&gt;</comment>
                            <comment id="2407665" author="daniel.hatcher" created="Wed, 4 Sep 2019 18:44:25 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt; have you had a chance to test 3.6.13?&lt;/p&gt;</comment>
                            <comment id="2350017" author="ihannah@meniscus.co.uk" created="Tue, 30 Jul 2019 12:33:24 +0000"  >&lt;p&gt;Hi Daniel,&lt;/p&gt;

&lt;p&gt;The test system does not&#160;have replication configured.&lt;/p&gt;

&lt;p&gt;We&#160;have upgraded Mongo on the test system and it seems to&#160;have not caused any issues so I am&#160;going to install on live tonight and then configure replication next week.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2339872" author="daniel.hatcher" created="Tue, 23 Jul 2019 20:43:33 +0000"  >&lt;p&gt;I can think of reasons why the network timeouts would occur but none of them match the scenario of everything working fine until the restart and then broken afterwards. I&apos;m hoping that 3.6.13 will either fix the problem or give us some better diagnostics to troubleshoot it.&lt;/p&gt;

&lt;p&gt;You mentioned that you have a live system and a test system. Have you ever seen this problem on the test system or is it only on the live system? Have you been able to reproduce the problem in any environment other than the one in which it&apos;s occurring? Maybe there&apos;s an issue with the underlying infrastructure.&lt;/p&gt;</comment>
                            <comment id="2335869" author="ihannah@meniscus.co.uk" created="Mon, 22 Jul 2019 08:27:14 +0000"  >&lt;p&gt;Hi Daniel,&lt;/p&gt;

&lt;p&gt;1. Nothing has changed hardware wise. As I mentioned the replication works well when we have copied the db over to the secondary and then run the system. These issues come into play when the secondary has been shut down for a while. Why would we suddenly get network timeouts when the secondary comes back online when it has been working fine up until this point? What would cause network timeouts? I am confused why network timeouts occur when replication restarts.&lt;/p&gt;

&lt;p&gt;2. It will take me a bit of time to get 3.6.13 on the live system. I will have to install it on the test system first and then onto live so this might take a week or two.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2334355" author="daniel.hatcher" created="Fri, 19 Jul 2019 16:14:49 +0000"  >&lt;p&gt;Hello Ian,&lt;/p&gt;

&lt;p&gt;Thanks for uploading the diagnostics from the Secondary. Importantly it tells us that the Secondary is replicating after it restarts. However, due to the network timeouts, the replication does not proceed fast enough so the node eventually reaches a stale state.&lt;/p&gt;

&lt;p&gt;Based on this information, I have a few follow-up questions.&lt;br/&gt;
1. You mentioned that this process was working cleanly until relatively recently. Can you think of any changes in the process, the workload, or underlying hardware that could have precipitated this change?&lt;br/&gt;
2. Would it be possible to upgrade the replica set to 3.6.13? There were performance improvements made and I am interested to see if this is reproducible in the latest version of 3.6.&lt;/p&gt;</comment>
                            <comment id="2329900" author="ihannah@meniscus.co.uk" created="Wed, 17 Jul 2019 08:29:36 +0000"  >&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;Apologies for that - I am not sure what happened there but I have uploaded the correct diagnostics for the secondary now.&lt;/p&gt;

&lt;p&gt;So when we start replication we copy the main db to the secondary db so that all is in sync to start with. We have replication running for a few days and the secondary is no more than 1-2 seconds behind.&lt;/p&gt;

&lt;p&gt;Then we turn off the secondary to back it up and then a couple of hours later we bring it online again. This is when the secondary never catches up and the network issues appear.&lt;/p&gt;</comment>
                            <comment id="2329189" author="eric.sedor" created="Tue, 16 Jul 2019 17:43:22 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;, I am sorry if I have not been clear: Inability for a Secondary to catch up after being down for multiple hours is not necessarily the result of a bug.&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;What I was saying in my last message is that the latest file in &lt;span class=&quot;error&quot;&gt;&amp;#91;^diagnostics data - secondary 090719.zip&amp;#93;&lt;/span&gt; is metrics.2019-05-29T19-26-22Z-00000, so we don&apos;t currently have Secondary diagnostic data for the 9/7 test. Can you try adding that data again?&lt;/li&gt;
	&lt;li&gt;When you say &quot;replication is not working&quot; are you saying you have not been able to add the secondary back to the set at all?&lt;/li&gt;
	&lt;li&gt;If so, have you tried an initial sync?&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="2326086" author="ihannah@meniscus.co.uk" created="Mon, 15 Jul 2019 06:18:40 +0000"  >&lt;p&gt;Hi Joseph/Eric,&lt;/p&gt;

&lt;p&gt;Have there been any developments on this? I am very keen to resolve this issue as currently we do not have replication working.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2320581" author="ihannah@meniscus.co.uk" created="Thu, 11 Jul 2019 07:26:42 +0000"  >&lt;p&gt;Hi Joseph,&lt;/p&gt;

&lt;p&gt;The latest files that I have attached have 09072019 in the name.&lt;/p&gt;

&lt;p&gt;The other two files were from the previous dump when I raised this ticket.&lt;/p&gt;

&lt;p&gt;So ignore the May ones.&lt;/p&gt;

&lt;p&gt;Does that make sense?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2320011" author="eric.sedor" created="Wed, 10 Jul 2019 19:11:57 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Can you double-check and provide the diagnostic data for the secondary node? It looks like both &lt;span class=&quot;error&quot;&gt;&amp;#91;^diagnostics data - secondary.zip&amp;#93;&lt;/span&gt; and &lt;span class=&quot;error&quot;&gt;&amp;#91;^diagnostics data - secondary 090719.zip&amp;#93;&lt;/span&gt; only contain data from 5/25 to 5/29.&lt;/p&gt;</comment>
                            <comment id="2316626" author="ihannah@meniscus.co.uk" created="Tue, 9 Jul 2019 11:46:38 +0000"  >&lt;p&gt;Hi Eric,&lt;/p&gt;

&lt;p&gt;Hopefully you have everything that you need. Please let me know if you need anything else.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2316606" author="ihannah@meniscus.co.uk" created="Tue, 9 Jul 2019 11:25:53 +0000"  >&lt;p&gt;Hi Eric,&lt;/p&gt;

&lt;p&gt;The same thing happened. Replication was all working (secondary was 0-1 second behind). We shut down the secondary last night for 2 hours, started it up again and it is lagging more and more.&lt;/p&gt;

&lt;p&gt;I have attached the log files for you.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2307086" author="ihannah@meniscus.co.uk" created="Mon, 1 Jul 2019 16:20:28 +0000"  >&lt;p&gt;Hi Eric,&lt;/p&gt;

&lt;p&gt;We set up replication last week and it is all working. It started going wrong last time when we backed up the secondary. We are going to try this within the next day or two so I will keep you posted.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2305510" author="eric.sedor" created="Fri, 28 Jun 2019 21:51:08 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;, I wanted to follow up to see if you&apos;ve experienced additional incidents or if you&apos;ve had a chance to perform another test.&lt;/p&gt;

&lt;p&gt;Eric&lt;/p&gt;</comment>
                            <comment id="2280834" author="ihannah@meniscus.co.uk" created="Wed, 12 Jun 2019 10:37:43 +0000"  >&lt;p&gt;Hi Eric,&lt;/p&gt;

&lt;p&gt;We will need to configure the replication again to get the logs.&lt;/p&gt;

&lt;p&gt;Unfortunately my colleague is away this week but hopefully we can do this early next week and then I can get you the logs.&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Ian&lt;/p&gt;</comment>
                            <comment id="2280166" author="eric.sedor" created="Tue, 11 Jun 2019 19:54:08 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;We can confirm what you are seeing in terms of the Secondary getting further and further behind. When this happens it&apos;s possible that the queries the Secondary issues to the Primary for replication can time out. This is not necessarily a bug and could be happening because the Secondary had been stopped for too long.&lt;/p&gt;

&lt;p&gt;To determine if this failure to catch up is the result of a bug, we need matching diagnostic data and log files from both the Primary and the Secondary for the same incident. You can submit these to this &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/e928cbbf-5f3d-46f8-958d-67e4d465c571.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;secure upload portal&lt;/a&gt;. Files uploaded here are only visible to MongoDB employees and will be deleted after some time.&lt;/p&gt;

&lt;p&gt;Thanks in advance!&lt;/p&gt;</comment>
                            <comment id="2277519" author="ihannah@meniscus.co.uk" created="Mon, 10 Jun 2019 13:11:56 +0000"  >&lt;p&gt;Hi Eric,&lt;/p&gt;

&lt;p&gt;I have only managed to get the logs for the secondary - the primary no longer has information going back to the end of May.&lt;/p&gt;

&lt;p&gt;Let me know if you can see anything in these logs. If necessary we&apos;ll have to go through the whole process again.&lt;/p&gt;</comment>
                            <comment id="2274288" author="eric.sedor" created="Thu, 6 Jun 2019 17:50:34 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ihannah%40meniscus.co.uk&quot; class=&quot;user-hover&quot; rel=&quot;ihannah@meniscus.co.uk&quot;&gt;ihannah@meniscus.co.uk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;For both the Primary and the Secondary, can you please archive (tar or zip) the &lt;tt&gt;$dbpath/diagnostic.data&lt;/tt&gt; directory (the contents are described &lt;a href=&quot;https://docs.mongodb.com/manual/administration/analyzing-mongodb-performance/#full-time-diagnostic-data-capture&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;) and attach it to this ticket?&lt;/p&gt;

&lt;p&gt;Thanks in advance!&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="224433" name="diagnostics data - secondary 090719 V2.zip" size="138389994" author="ihannah@meniscus.co.uk" created="Wed, 17 Jul 2019 08:22:03 +0000"/>
                            <attachment id="223605" name="mongod - primary - 090719.log.zip" size="46975865" author="ihannah@meniscus.co.uk" created="Tue, 9 Jul 2019 11:37:51 +0000"/>
                            <attachment id="223603" name="mongod - secondary - 090719.log.zip" size="1007948" author="ihannah@meniscus.co.uk" created="Tue, 9 Jul 2019 11:36:32 +0000"/>
                            <attachment id="223606" name="primary-metrics-09072019.zip" size="125485383" author="ihannah@meniscus.co.uk" created="Tue, 9 Jul 2019 11:42:57 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>37.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 3 Jun 2019 00:38:42 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        3 years, 46 weeks, 6 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            3 years, 46 weeks, 6 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>daniel.hatcher@mongodb.com</customfieldvalue>
            <customfieldvalue>dmitry.agranat@mongodb.com</customfieldvalue>
            <customfieldvalue>eric.sedor@mongodb.com</customfieldvalue>
            <customfieldvalue>ihannah@meniscus.co.uk</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hv2oq7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hurwon:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[eric.sedor@mongodb.com]]></customfieldvalue>
        <customfieldvalue><![CDATA[dmitry.agranat@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hv2azj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>