<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:21:59 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-29854] Secondary members no more syncing until we restart it (oplog blocked)</title>
                <link>https://jira.mongodb.org/browse/SERVER-29854</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;Since we upgraded our cluster from 3.2.8 to 3.4.3, regularly secondary members of our most loaded shard stopping syncing to primary. When that happens, the field &quot;oplog last event time&quot; returned by command rs.printReplicationInfo() stays indefinitely blocked. The secondary member just fall behind and we need to restart it in order to free the oplog, allowing to catch up the primary member.&lt;/p&gt;

&lt;p&gt;When it happens, just before the service shutdown, in logs we can see this error :&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-11T05:13:46.410+0200 I REPL     [replication-7923] Error returned from oplog query while canceling query: ExceededTimeLimit: Remote command timed out while waiting to get a connection from the pool, took 39149258ms, timeout was set to 65000ms&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-11T05:13:46.411+0200 I REPL     [rsBackgroundSync] Replication producer stopped after oplog fetcher finished returning a batch from our sync source.  Abandoning this batch of oplog entries and re-evaluating our sync source.&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;The &quot;took 39149258ms, timeout was set to 65000ms&quot; is strange, seems that the oplog query  was longer that the timeout. The shard on witch it happens receives huge amount of read and writes so it&apos;s not shocking for us to see a oplog query to fail. But the secondary should be able to retry it.&lt;/p&gt;

&lt;p&gt;And sometimes when the oplog is blocked and we restarting the concerned member, it produces theses messages in logs :&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:32.914+0200 I CONTROL  [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:32.914+0200 I NETWORK  [signalProcessingThread] shutdown: going to close listening sockets...&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:32.914+0200 I NETWORK  [signalProcessingThread] closing listening socket: 7&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:32.915+0200 I NETWORK  [signalProcessingThread] shutdown: going to flush diaglog...&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:32.915+0200 I REPL     [signalProcessingThread] shutting down replication subsystems&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:32.916+0200 I REPL     [signalProcessingThread] Stopping replication reporter thread&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.051+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.30:27025 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.052+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.3:27025 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.052+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.28:27025 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.052+0200 W NETWORK  [ReplicaSetMonitor-TaskExecutor-0] No primary detected for set csReplSet&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.052+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] All nodes for set csReplSet are down. This has happened for 1 checks in a row.&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.329+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.18:27017 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.329+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.16:27017 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.329+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.17:27017 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.329+0200 W NETWORK  [ReplicaSetMonitor-TaskExecutor-0] No primary detected for set sfx6&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.329+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] All nodes for set sfx6 are down. This has happened for 1 checks in a row.&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.839+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.30:27017 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.839+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.29:27017 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.839+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] Marking host 172.16.18.28:27017 as failed :: caused by :: Location17382: Can&apos;t use connection pool during shutdown&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.839+0200 W NETWORK  [ReplicaSetMonitor-TaskExecutor-0] No primary detected for set sfx10&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;2017-06-25T17:35:57.839+0200 I NETWORK  [ReplicaSetMonitor-TaskExecutor-0] All nodes for set sfx10 are down. This has happened for 1 checks in a row.&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;[...]&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;In that case only a &quot;kill -9&quot; let the member to shutdown.&lt;/p&gt;


&lt;p&gt;Our Mongo cluster details :&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Many shards + config replica set, each formed by 3 members (1 primary + 2 secondary)&lt;/li&gt;
	&lt;li&gt;2 mongos&lt;/li&gt;
	&lt;li&gt;Balancer is disabled&lt;/li&gt;
	&lt;li&gt;Package version 3.4.4, OS: Debian 8 Jessie&lt;/li&gt;
	&lt;li&gt;Servers: 6 cores Xeon CPU, 64GB RAM, ~3To SSD, ext4 file system&lt;/li&gt;
	&lt;li&gt;~ 40 collections in 1 DB&lt;/li&gt;
	&lt;li&gt;Many writes and reads&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Thank you in advance for your help.&lt;/p&gt;

&lt;p&gt;Best regards,&lt;br/&gt;
Slawomir&lt;/p&gt;</description>
                <environment></environment>
        <key id="398185">SERVER-29854</key>
            <summary>Secondary members no more syncing until we restart it (oplog blocked)</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="mark.agarunov">Mark Agarunov</assignee>
                                    <reporter username="slluk-sa">Slawomir Lukiewski</reporter>
                        <labels>
                            <label>MAREF</label>
                            <label>SWNA</label>
                    </labels>
                <created>Mon, 26 Jun 2017 16:21:53 +0000</created>
                <updated>Thu, 7 Dec 2017 21:26:34 +0000</updated>
                            <resolved>Tue, 7 Nov 2017 19:31:39 +0000</resolved>
                                    <version>3.4.3</version>
                    <version>3.4.4</version>
                                                    <component>Replication</component>
                                        <votes>2</votes>
                                    <watches>12</watches>
                                                                                                                <comments>
                            <comment id="1719648" author="mark.agarunov" created="Tue, 7 Nov 2017 19:31:21 +0000"  >&lt;p&gt;Hello &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=slluk-sa&quot; class=&quot;user-hover&quot; rel=&quot;slluk-sa&quot;&gt;slluk-sa&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for the additional information. After investigating the provided data, this appears to be due to a network issue, where slow communication between &lt;tt&gt;mongod&lt;/tt&gt; nodes will keep locks from yielding and cause the behavior you&apos;re seeing. We&apos;ve opened &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-31368&quot; title=&quot;Log time spent waiting for other shards in merge cursors aggregation stage&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-31368&quot;&gt;&lt;del&gt;SERVER-31368&lt;/del&gt;&lt;/a&gt; for additional logging to be added to better diagnose this issue in the future.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Mark&lt;/p&gt;</comment>
                            <comment id="1607391" author="slluk-sa" created="Tue, 27 Jun 2017 10:37:19 +0000"  >&lt;p&gt;Hi Thomas,&lt;/p&gt;

&lt;p&gt;Thank you for your answer.&lt;br/&gt;
For each of the 3 members of the concerned shard / replica set, I uploaded 2 files :&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;An archive of diagnostic.data folder (seems to contain data from June 20).&lt;/li&gt;
	&lt;li&gt;The mongod log file from June 18 to June 26.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;You should look at events of &lt;b&gt;June 23&lt;/b&gt; at night, on member with name ending with &lt;b&gt;24&lt;/b&gt;. That day we got the problem twice (the second time with failed mongod shutdown).&lt;br/&gt;
Chronology :&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;02:29 : &quot;oplog last event time&quot; blocked (based on our monitoring)&lt;/li&gt;
	&lt;li&gt;05:08 : mongod restart (2017-06-23T05:08:15.647+0200 in log), then started to catch up primary&lt;/li&gt;
	&lt;li&gt;06:00 : &quot;oplog last event time&quot; blocked again&lt;/li&gt;
	&lt;li&gt;07:27 : mongod restart try (2017-06-23T07:27:08.761+0200 in log), mongod has not stopped, lots of &quot;No primary detected for...&quot; errors in log&lt;/li&gt;
	&lt;li&gt;07:33 : kill -9 mongod, then start, oplog catch up primary...&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Best regards,&lt;br/&gt;
Slawomir&lt;/p&gt;</comment>
                            <comment id="1606809" author="thomas.schubert" created="Mon, 26 Jun 2017 18:33:36 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=slluk-sa&quot; class=&quot;user-hover&quot; rel=&quot;slluk-sa&quot;&gt;slluk-sa&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thanks for reporting this issue. We&apos;ll need some additional information to better understand what is going on here. To help us investigate, would you please provide the complete mongod log files covering this issue as well as an archive of the diagnostic.data for the three nodes in the affected replica set?&lt;/p&gt;

&lt;p&gt;I&apos;ve created a secure &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/22069fde-9704-48af-aa46-519e71cea647.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;upload portal&lt;/a&gt; for you to provide these files. Files uploaded to the portal are visible only to MongoDB employees investigating the issue and are routinely deleted after some time.&lt;/p&gt;

&lt;p&gt;Kind regards,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="439486">SERVER-31368</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="166927" name="ftdc2.png" size="57038" author="mark.agarunov" created="Mon, 2 Oct 2017 19:57:51 +0000"/>
                            <attachment id="167060" name="stuck.png" size="323529" author="bruce.lucas@mongodb.com" created="Tue, 3 Oct 2017 16:06:49 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 26 Jun 2017 18:33:36 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 14 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>backlog-server-pm</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 14 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>
            <customfieldvalue>mark.agarunov</customfieldvalue>
            <customfieldvalue>slluk-sa</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht9x1z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|ht1vqn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[kelsey.schubert@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht9j4f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>