<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:21:36 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-9846] Slave replication failure replHandshake unauthorized</title>
                <link>https://jira.mongodb.org/browse/SERVER-9846</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;I previously have a database in Mongo 1.8 that was running master/slave replication.&lt;/p&gt;

&lt;p&gt;The data store has three databases: local, admin, and appdb&lt;/p&gt;

&lt;p&gt;I upgraded directly to Mongo 2.4.3, and I am starting to get sporadic master/slave replication failure. Specifically, I am seeing on the slaves:&lt;/p&gt;

&lt;p&gt;Sat Jun  1 14:48:29.702 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Sat Jun  1 14:48:29.703 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt; &lt;br/&gt;
Sat Jun  1 14:48:29.703 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Sat Jun  1 14:48:29.703 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Sat Jun  1 14:48:29.704 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sat Jun  1 14:48:32.710 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sat Jun  1 14:48:33.080 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; replHandshake res not: 0 res: &lt;/p&gt;
{ ok: 0.0, errmsg: &quot;unauthorized&quot; }

&lt;p&gt;And on the master:&lt;/p&gt;

&lt;p&gt;Sat Jun 01 09:49:39.024 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn1767&amp;#93;&lt;/span&gt;  authenticate db: local &lt;/p&gt;
{ authenticate: 1, nonce: &quot;XXXXXXXXXX&quot;, user: &quot;repl&quot;, key: &quot;XXXXXXXXX&quot; }
&lt;p&gt;Sat Jun 01 09:49:39.118 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn1767&amp;#93;&lt;/span&gt; command denied: &lt;/p&gt;
{ handshake: ObjectId(&apos;XXXXXXXX-changes every time it outputs&apos;) }

&lt;p&gt;The error is not always present. I have three slave instances. One is replicating fine with no error messages. The other two would start replication, but at some point encounter the &quot;unauthorized&quot; errmsg.&lt;/p&gt;

&lt;p&gt;I tried deleting and re-creating the slave database, including adding the &quot;repl&quot; user to the local database. That works for a while, then the error comes up again.&lt;/p&gt;

&lt;p&gt;I tried changing the &quot;repl&quot; user on the master local database, including removing the old &quot;readOnly&quot; field and adding in roles &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;#39;read&amp;#39;, &amp;#39;readWrite&amp;#39;, &amp;#39;dbAdmin&amp;#39;&amp;#93;&lt;/span&gt;, but it still doesn&apos;t work.&lt;/p&gt;</description>
                <environment></environment>
        <key id="77554">SERVER-9846</key>
            <summary>Slave replication failure replHandshake unauthorized</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="spencer@mongodb.com">Spencer Brody</assignee>
                                    <reporter username="ckpeter">Peter Chan</reporter>
                        <labels>
                    </labels>
                <created>Sat, 1 Jun 2013 14:59:55 +0000</created>
                <updated>Wed, 10 Dec 2014 23:10:51 +0000</updated>
                            <resolved>Mon, 10 Jun 2013 17:10:06 +0000</resolved>
                                    <version>2.4.3</version>
                                                    <component>Replication</component>
                    <component>Security</component>
                                        <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="356984" author="auto" created="Mon, 10 Jun 2013 18:23:13 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{u&apos;username&apos;: u&apos;stbrody&apos;, u&apos;name&apos;: u&apos;Spencer T Brody&apos;, u&apos;email&apos;: u&apos;spencer@10gen.com&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-9846&quot; title=&quot;Slave replication failure replHandshake unauthorized&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-9846&quot;&gt;&lt;del&gt;SERVER-9846&lt;/del&gt;&lt;/a&gt; Make master/slave auth test use keyFile&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/64a1136d69b700d9e7e40e7fd8c43f9a8108ac76&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/64a1136d69b700d9e7e40e7fd8c43f9a8108ac76&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="356929" author="ckpeter" created="Mon, 10 Jun 2013 17:23:29 +0000"  >&lt;p&gt;Thank you. Appreciate it.&lt;/p&gt;</comment>
                            <comment id="356921" author="spencer" created="Mon, 10 Jun 2013 17:10:06 +0000"  >&lt;p&gt;To summarize the original master/slave auth issues:&lt;/p&gt;

&lt;p&gt;The old way of enabling access control in a master/slave system, of adding a user named &quot;repl&quot; to the local database of all nodes, no longer is valid as of 2.4.  The new way to enable auth on a master/slave system is to use --keyFile, just as is done for Replica Sets.&lt;/p&gt;</comment>
                            <comment id="356910" author="spencer" created="Mon, 10 Jun 2013 16:59:55 +0000"  >&lt;p&gt;Hi Peter,&lt;br/&gt;
This ticket has diverged a bit from the original problem with authentication.  I have created SUPPORT-593 on your behalf which is a private ticket visible only to you and employees of 10gen so that you can go into further investigation of the network issues you are seeing with members of our support team.&lt;br/&gt;
I&apos;d like to go ahead and resolve this ticket as the original problems with authentication seem to be resolved by the switch to using --keyFile.&lt;br/&gt;
Please let us know if you have any further questions/issues.&lt;/p&gt;</comment>
                            <comment id="355759" author="ckpeter" created="Sat, 8 Jun 2013 04:09:46 +0000"  >&lt;p&gt;Ok, I upgraded the master back to 2.4.3. The problem is now re-occurring on one of the 2.4.3 slaves. It is now producing:&lt;/p&gt;

&lt;p&gt;Fri Jun 07 23:04:56.207 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Fri Jun 07 23:04:56.207 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Fri Jun 07 23:04:56.207 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Fri Jun 07 23:04:59.207 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Fri Jun 07 23:05:30.395 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Fri Jun 07 23:05:30.395 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt;&lt;br/&gt;
Fri Jun 07 23:05:30.395 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Fri Jun 07 23:05:30.395 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Fri Jun 07 23:05:30.395 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Fri Jun 07 23:05:33.395 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Fri Jun 07 23:06:04.660 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Fri Jun 07 23:06:04.660 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt;&lt;br/&gt;
Fri Jun 07 23:06:04.660 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Fri Jun 07 23:06:04.660 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Fri Jun 07 23:06:04.660 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Fri Jun 07 23:06:07.660 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;/p&gt;

&lt;p&gt;Another 2.4.3 slave, on the same server, connected to the same SSH tunnel, is replicating fine (note contemporary timestamps):&lt;/p&gt;

&lt;p&gt;Fri Jun 07 23:02:25.520 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1320 operations&lt;br/&gt;
Fri Jun 07 23:02:25.520 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 07 23:02:25 51b2acd1:a&lt;br/&gt;
Fri Jun 07 23:03:26.004 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1410 operations&lt;br/&gt;
Fri Jun 07 23:03:26.004 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 07 23:03:25 51b2ad0d:15&lt;br/&gt;
Fri Jun 07 23:04:27.098 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1440 operations&lt;br/&gt;
Fri Jun 07 23:04:27.098 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 07 23:04:26 51b2ad4a:11&lt;br/&gt;
Fri Jun 07 23:05:28.567 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1440 operations&lt;br/&gt;
Fri Jun 07 23:05:28.567 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 07 23:05:28 51b2ad88:6&lt;br/&gt;
Fri Jun 07 23:06:29.785 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1470 operations&lt;br/&gt;
Fri Jun 07 23:06:29.785 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 07 23:06:29 51b2adc5:9&lt;/p&gt;

&lt;p&gt;I also connected to the localhost end point of the SSH tunnel with a 2.4 shell, and this is the transcript:&lt;/p&gt;

&lt;p&gt;MongoDB shell version: 2.4.3&lt;br/&gt;
connecting to: localhost:37017/test&lt;br/&gt;
&amp;gt; use admin&lt;br/&gt;
switched to db admin&lt;br/&gt;
&amp;gt; db.auth(&apos;username&apos;, &apos;password123&apos;)&lt;br/&gt;
1&lt;br/&gt;
&amp;gt; db.isMaster()&lt;br/&gt;
{&lt;br/&gt;
        &quot;ismaster&quot; : true,&lt;br/&gt;
        &quot;maxBsonObjectSize&quot; : 16777216,&lt;br/&gt;
        &quot;maxMessageSizeBytes&quot; : 48000000,&lt;br/&gt;
        &quot;localTime&quot; : ISODate(&quot;2013-06-08T03:59:40.425Z&quot;),&lt;br/&gt;
        &quot;ok&quot; : 1&lt;br/&gt;
}&lt;br/&gt;
&amp;gt; show dbs&lt;br/&gt;
admin   0.03125GB&lt;br/&gt;
xxxxdb     6.996826171875GB&lt;br/&gt;
local   2.0302734375GB&lt;br/&gt;
&amp;gt;&lt;/p&gt;</comment>
                            <comment id="355432" author="ckpeter" created="Fri, 7 Jun 2013 17:31:45 +0000"  >&lt;p&gt;Ok. This will take me a day or two, as I need to schedule the master offline and upgrade it back to 2.4.3 from 2.2.4.&lt;/p&gt;

&lt;p&gt;Are there any commands you want me to run on the 2.4 shell over the ssh tunnel, if I can establish connection successfully?&lt;/p&gt;

&lt;p&gt;I ask because, from the look of the log, the initial connection is fine, but something subsequently happens with the &quot;DBClientCursor::init call() failed&quot; line.&lt;/p&gt;</comment>
                            <comment id="355335" author="spencer" created="Fri, 7 Jun 2013 15:18:31 +0000"  >&lt;p&gt;Can you try connecting to the master with a 2.4 shell from the slave machine using the same hostname/IP that the slave node is using (I believe that&apos;s localhost:37017)?&lt;/p&gt;</comment>
                            <comment id="354681" author="ckpeter" created="Thu, 6 Jun 2013 18:35:01 +0000"  >&lt;p&gt;Also, if it helps to have more direct access, feel free to contact me offline. I can set something up.&lt;/p&gt;</comment>
                            <comment id="354675" author="ckpeter" created="Thu, 6 Jun 2013 18:26:34 +0000"  >&lt;p&gt;This is the expected setup. The master is listening on port 27017 on MASTER_IP. The slave node has an SSH tunnel that forwards SLAVE_IP:37017 to MASTER_IP:27017, and the slave mongo then connects to the localhost:27017.&lt;/p&gt;

&lt;p&gt;I ruled out problem with the SSH tunnel, because on one occasion, I have two slaves connecting to the same tunnel, one replicating in real time successfully, the other ones has a slave-delay and was failing with the SocketException (I don&apos;t believe the slave delay option is a factor here).&lt;/p&gt;

&lt;p&gt;All arguments and options are EXACTLY the same. I have a db start script that keeps all options the same, and the only difference is I swap between 2.2.4 and 2.4.3 version of bin/mongod.exe to invoke.&lt;/p&gt;

&lt;p&gt;Yes, when I use keyFile with 2.2 nodes, it works fine with no SocketException. As noted above, it appears that this particular SocketException only arises when ALL nodes are running 2.4. If either the master or slave is running 2.2, then everything (so far in the last day) runs fine.&lt;/p&gt;

&lt;p&gt;Currently, I have a 2.2.4 master running, along with 3 x 2.4.3 slaves, using keyFile, and they are replicating fine. Obviously, I want to move the master up to 2.4, assuming we can figure this out.&lt;/p&gt;</comment>
                            <comment id="354618" author="spencer" created="Thu, 6 Jun 2013 17:16:51 +0000"  >&lt;p&gt;From the logs you posted, it looks like your master is listening on port 27017, but your secondary is trying to connect to port 37017.&lt;/p&gt;


&lt;p&gt;When you&apos;ve been downgrading to 2.2, have you been keeping the same startup options?  Were you still using keyFile with 2.2?  Did authentication with keyFile work in 2.2?&lt;/p&gt;</comment>
                            <comment id="354091" author="ckpeter" created="Thu, 6 Jun 2013 00:26:12 +0000"  >&lt;p&gt;So, here are the logs from the slave and master, both running 2.4.3. The logs are captured while only the master and that single failing slave is running. No other drivers / nodes are connecting. Timestamps should be approximately correct.&lt;/p&gt;

&lt;p&gt;Log from slave:&lt;/p&gt;

&lt;p&gt;Wed Jun 05 18:47:50.943 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; db version v2.4.3&lt;br/&gt;
Wed Jun 05 18:47:50.943 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; git version: fe1743177a5ea03e91e0052fb5e2cb2945f6d95f&lt;br/&gt;
Wed Jun 05 18:47:50.943 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack=&apos;Service Pack 1&apos;) BOOST_LIB_VERSION=1_49&lt;br/&gt;
Wed Jun 05 18:47:50.943 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; allocator: system&lt;br/&gt;
Wed Jun 05 18:47:50.943 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; options: &lt;/p&gt;
{ auth: true, autoresync: true, dbpath: &quot;/data/slave-delay&quot;, journal: true, keyFile: &quot;\keyfile.txt&quot;, port: 47018
, slave: true, slavedelay: 43200, smallfiles: true, source: &quot;localhost:37017&quot; }
&lt;p&gt;Wed Jun 05 18:47:50.959 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; journal dir=/data/slave-delay\journal&lt;br/&gt;
Wed Jun 05 18:47:50.959 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; recover : no journal files present, no recovery needed&lt;br/&gt;
Wed Jun 05 18:47:50.990 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; waiting for connections on port 47018&lt;br/&gt;
Wed Jun 05 18:47:50.990 &lt;span class=&quot;error&quot;&gt;&amp;#91;websvr&amp;#93;&lt;/span&gt; admin web console waiting for connections on port 48018&lt;br/&gt;
Wed Jun 05 18:47:51.990 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun 05 18:47:53.006 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Wed Jun 05 18:47:53.006 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Wed Jun 05 18:47:56.006 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun 05 18:48:26.803 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Wed Jun 05 18:48:26.803 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt;&lt;br/&gt;
Wed Jun 05 18:48:26.803 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Wed Jun 05 18:48:26.803 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Wed Jun 05 18:48:26.803 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Wed Jun 05 18:48:29.803 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun 05 18:49:00.553 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Wed Jun 05 18:49:00.553 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt;&lt;br/&gt;
Wed Jun 05 18:49:00.553 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Wed Jun 05 18:49:00.553 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Wed Jun 05 18:49:00.553 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Wed Jun 05 18:49:03.553 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun 05 18:49:34.334 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Wed Jun 05 18:49:34.334 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt;&lt;br/&gt;
Wed Jun 05 18:49:34.334 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Wed Jun 05 18:49:34.334 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Wed Jun 05 18:49:34.334 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;/p&gt;

&lt;p&gt;Log from master:&lt;/p&gt;

&lt;p&gt;Wed Jun 05 18:47:37.988 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; MongoDB starting : pid=4296 port=27017 dbpath=\datadata\ master=1 64-bit host=master&lt;br/&gt;
Wed Jun 05 18:47:37.988 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; db version v2.4.3&lt;br/&gt;
Wed Jun 05 18:47:37.988 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; git version: fe1743177a5ea03e91e0052fb5e2cb2945f6d95f&lt;br/&gt;
Wed Jun 05 18:47:37.988 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack=&apos;Service Pack 1&apos;) BOOST_LIB_VERSION=1_49&lt;br/&gt;
Wed Jun 05 18:47:37.988 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; allocator: system&lt;br/&gt;
Wed Jun 05 18:47:37.988 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; options: &lt;/p&gt;
{ auth: true, journal: true, keyFile: &quot;\keyfile.txt&quot;, master: true, oplogSize: 2000, rest: true, smallfiles: true }
&lt;p&gt;Wed Jun 05 18:47:38.003 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; journal dir=\datdata\journal&lt;br/&gt;
Wed Jun 05 18:47:38.003 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; recover : no journal files present, no recovery needed&lt;br/&gt;
Wed Jun 05 18:47:38.050 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; waiting for connections on port 27017&lt;br/&gt;
Wed Jun 05 18:47:38.050 &lt;span class=&quot;error&quot;&gt;&amp;#91;websvr&amp;#93;&lt;/span&gt; admin web console waiting for connections on port 28017&lt;br/&gt;
Wed Jun 05 18:47:56.050 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; connection accepted from MASTER_IP:30930 #1 (1 connection now open)&lt;br/&gt;
Wed Jun 05 18:47:56.191 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn1&amp;#93;&lt;/span&gt;  authenticate db: local &lt;/p&gt;
{ authenticate: 1, nonce: &quot;asdfasdfasdfasdf&quot;, user: &quot;__system&quot;, key: &quot;afdfsfadfasdfasdfasdfsdf&quot; }
&lt;p&gt;Wed Jun 05 18:48:29.847 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; connection accepted from MASTER_IP:30932 #2 (2 connections now open)&lt;br/&gt;
Wed Jun 05 18:48:29.956 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn2&amp;#93;&lt;/span&gt;  authenticate db: local &lt;/p&gt;
{ authenticate: 1, nonce: &quot;asdfasdfasdfasdf&quot;, user: &quot;__system&quot;, key: &quot;9d65c92fsfsfdfsdfsdf51518a79fa4a51e&quot; }
&lt;p&gt;Wed Jun 05 18:49:03.597 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; connection accepted from MASTER_IP:30933 #3 (3 connections now open)&lt;br/&gt;
Wed Jun 05 18:49:03.738 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn3&amp;#93;&lt;/span&gt;  authenticate db: local &lt;/p&gt;
{ authenticate: 1, nonce: &quot;asdfasdfasdfasdf&quot;, user: &quot;__system&quot;, key: &quot;f8094c85adfasdfasdfasdfsdfde1adf75&quot; }
&lt;p&gt;Wed Jun 05 18:49:37.800 &lt;span class=&quot;error&quot;&gt;&amp;#91;initandlisten&amp;#93;&lt;/span&gt; connection accepted from MASTER_IP:30935 #4 (4 connections now open)&lt;br/&gt;
Wed Jun 05 18:49:37.941 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn4&amp;#93;&lt;/span&gt;  authenticate db: local &lt;/p&gt;
{ authenticate: 1, nonce: &quot;asdfasdfasdfasdf&quot;, user: &quot;__system&quot;, key: &quot;f2d24asfasdfasdfasdfasdfasdfdsfe1c2fe&quot; }
&lt;p&gt;Wed Jun 05 18:49:38.972 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn1&amp;#93;&lt;/span&gt; query local.oplog.$main query: { ts: &lt;/p&gt;
{ $gte: Timestamp 1370420848000|22 }
&lt;p&gt; } cursorid:332362094268386 ntoreturn:0 ntoskip:0 nscanned:102 keyUpdates:0 numYields: 6567 locks(micros) r:3683308 nreturned:101 reslen:11750 102609ms&lt;br/&gt;
Wed Jun 05 18:49:38.972 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn1&amp;#93;&lt;/span&gt; end connection MASTER_IP:30930 (3 connections now open)&lt;/p&gt;

&lt;p&gt;Later, I took the chance to downgrade the master to 2.2 while running the slaves at 2.4, and replication still works on all of them, so it appears the problem has to do with the use of keyFile AND the use of 2.4.3 on ALL nodes.&lt;/p&gt;</comment>
                            <comment id="353931" author="ckpeter" created="Wed, 5 Jun 2013 20:50:42 +0000"  >&lt;p&gt;Ok, I will post the console log from master when I get a chance to isolate the display. But at first glance, there is no obvious error on the master log.&lt;/p&gt;

&lt;p&gt;This error stops replication on the affected node. The problem is not visible to application, since this is M/S replication and the master is entirely not affected.&lt;/p&gt;

&lt;p&gt;I have swapped the slaves back to 2.2.4, which immediately resolves the issue. Once I put on 2.4.3, the problem with the SocketException creeps up again, on 2 of 3 nodes.&lt;/p&gt;

&lt;p&gt;It is also sporadic. On my Linux node, it happened on 2.4, I swap to 2.2, it works, I swap back to 2.4, and it still currently and at the moment, replicating (but obviously happened before, and is currently still happening on the Windows node on 2.4).&lt;/p&gt;

&lt;p&gt;It does look like a networking issue, but it seems likely a combination of changes in 2.4 + keyFile, because 1) this particular error didn&apos;t happen before the use of keyFile, 2) on the 2 windows slaves, they both share the same underlying SSH tunnel, but one of them is replicating fine, while the other has this SocketException, 3) use of 2.2.4 does not have this issue across multiple restarts.&lt;/p&gt;</comment>
                            <comment id="353909" author="spencer" created="Wed, 5 Jun 2013 20:26:35 +0000"  >&lt;p&gt;Also, when you see this error, does replication stop?  Are there problems visible to your application, or is this just something you&apos;re seeing in the logs?&lt;/p&gt;</comment>
                            <comment id="353908" author="spencer" created="Wed, 5 Jun 2013 20:24:40 +0000"  >&lt;p&gt;Hard to tell.  At first glance it seems more likely to be a networking issue than an auth issue, but it&apos;s weird that you didn&apos;t see it before changing to use keyFile...&lt;/p&gt;

&lt;p&gt;Can you post logs for the same time period from the primary?&lt;/p&gt;</comment>
                            <comment id="353759" author="ckpeter" created="Wed, 5 Jun 2013 17:38:38 +0000"  >&lt;p&gt;I tried out the keyFile option and it seems to work and replication proceeds, at least initially. But after a few hours, I am getting this cursor message:&lt;/p&gt;

&lt;p&gt;Wed Jun  5 17:09:41.267 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun  5 17:09:41.267 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Wed Jun  5 17:09:41.267 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Wed Jun  5 17:09:44.275 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun  5 17:10:14.607 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Wed Jun  5 17:10:14.607 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt;&lt;br/&gt;
Wed Jun  5 17:10:14.607 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Wed Jun  5 17:10:14.607 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Wed Jun  5 17:10:14.608 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Wed Jun  5 17:10:17.615 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Wed Jun  5 17:10:47.994 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; Socket recv() timeout  127.0.0.1:37017&lt;br/&gt;
Wed Jun  5 17:10:47.995 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; SocketException: remote: 127.0.0.1:37017 error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; server &lt;span class=&quot;error&quot;&gt;&amp;#91;127.0.0.1:37017&amp;#93;&lt;/span&gt; &lt;br/&gt;
Wed Jun  5 17:10:47.995 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Wed Jun  5 17:10:47.995 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt;   repl: dbclient::query returns null (conn closed?)&lt;br/&gt;
Wed Jun  5 17:10:47.995 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Wed Jun  5 17:10:51.002 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;/p&gt;

&lt;p&gt;Again, this is running on 2.4.3 master and 2.4.3 slaves. Of the three slaves setup, one is sync&apos;ing properly, but another two has the above error messages (one on Windows and the other on Linux). Restarting the slave nodes has little (or no effect) on resolving the above error.&lt;/p&gt;

&lt;p&gt;This seems to be triggered by the use of keyFile and 2.4.3, as I haven&apos;t seen it before. Is it related?&lt;/p&gt;</comment>
                            <comment id="353734" author="spencer" created="Wed, 5 Jun 2013 17:17:38 +0000"  >&lt;p&gt;Hi Peter,&lt;br/&gt;
I did some further digging, and it turns out that if you use keyFile, you don&apos;t need the &quot;repl&quot; user at all.  The slaves will authenticate to the master using the keyFile, much like they do in a replica set configuration, and the local db&apos;s &quot;repl&quot; user will be ignored.  Switching to --keyFile is a configuration setting, so once you&apos;ve made that change you should be able to leave it alone and from now on your cluster will continue to perform internal authentication using the keyFile.  Client authentication (for example from your app or from the shell) should be unaffected - the keyFile is only used for cluster members to authenticate to each other.&lt;/p&gt;

&lt;p&gt;I filed &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-1571&quot; title=&quot;Update master/slave documentation with auth to recommend keyFile&quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-1571&quot;&gt;&lt;del&gt;DOCS-1571&lt;/del&gt;&lt;/a&gt; to update the documentation to reflect this.&lt;/p&gt;</comment>
                            <comment id="353222" author="ckpeter" created="Wed, 5 Jun 2013 02:57:37 +0000"  >&lt;p&gt;Ok, this requires that I take the production database offline, so I need to schedule for it. It will take me another day or two to report back.&lt;/p&gt;

&lt;p&gt;In the meantime, can you give a little background on what this is supposed to do? I know keyFile is for ReplicaSet, but is it also used for Master/slave? Is this a one-time operation or do I need to switch to using keyFile for authentication from now on? How does it relate to the previous &quot;repl&quot; user in the local database?&lt;/p&gt;</comment>
                            <comment id="352965" author="spencer" created="Tue, 4 Jun 2013 21:01:44 +0000"  >&lt;p&gt;Hi Peter,&lt;br/&gt;
Can you please try restarting all your nodes with --keyFile, giving the contents of the keyfile the same password as the &quot;repl&quot; user you set up in the local database?  I believe that should resolve this issue.  More information about setting up keyFile is available here:&lt;br/&gt;
&lt;a href=&quot;http://docs.mongodb.org/manual/core/replication/#replica-set-security&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://docs.mongodb.org/manual/core/replication/#replica-set-security&lt;/a&gt;&lt;br/&gt;
and&lt;br/&gt;
&lt;a href=&quot;http://docs.mongodb.org/manual/tutorial/generate-key-file/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://docs.mongodb.org/manual/tutorial/generate-key-file/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="351844" author="ckpeter" created="Mon, 3 Jun 2013 19:25:10 +0000"  >&lt;p&gt;Spencer, unfortunately, I am not able to upgrade to ReplicaSets at this point, due to operational requirements.&lt;/p&gt;

&lt;p&gt;Let me know if there is anything I can help with developing a fix / workaround.&lt;/p&gt;</comment>
                            <comment id="351839" author="spencer" created="Mon, 3 Jun 2013 19:17:54 +0000"  >&lt;p&gt;Hi Peter,&lt;br/&gt;
This seems to be a bug with auth in master/slave.  Would it be possible for you to upgrade to using ReplicaSets?  This blog post has some discussion about how to do that upgrade: &lt;a href=&quot;https://wiki.10gen.com/pages/viewpage.action?pageId=29819353#UpgradingtoReplicaSets-UpgradingFromReplicaPairsorMaster%2FSlave&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://wiki.10gen.com/pages/viewpage.action?pageId=29819353#UpgradingtoReplicaSets-UpgradingFromReplicaPairsorMaster%2FSlave&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If using ReplicaSets isn&apos;t an option for you let me know - there may be a workaround to make auth work with master/slave, but it will require some work to confirm and test it.&lt;/p&gt;</comment>
                            <comment id="351387" author="ckpeter" created="Mon, 3 Jun 2013 06:12:55 +0000"  >&lt;p&gt;I managed to, mostly, resolve the operational impact, by downgrading only the slaves to Mongo 2.2.4, and while the master remains at 2.4.3.&lt;/p&gt;

&lt;p&gt;All three slaves instances (running 2.2.4) are now replicating fine, but I still see sporadic &quot;unauthorized&quot; messages. Here is a sample:&lt;/p&gt;

&lt;p&gt;Sun Jun  2 03:14:05 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun  2 03:14:11 51aab883:e&lt;br/&gt;
Sun Jun  2 03:15:06 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1155 operations&lt;br/&gt;
Sun Jun  2 03:15:06 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun  2 03:15:05 51aab8b9:5&lt;br/&gt;
Sun Jun  2 03:16:14 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: AssertionException dbclient error communicating with server: localhost:37017&lt;br/&gt;
repl: sleep 2 sec before next pass&lt;br/&gt;
Sun Jun  2 03:16:16 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:16:16 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Sun Jun  2 03:16:16 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sun Jun  2 03:16:19 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:16:19 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Sun Jun  2 03:16:19 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sun Jun  2 03:16:22 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:16:26 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; replHandshake res not: 0 res: &lt;/p&gt;
{ ok: 0.0, errmsg: &quot;unauthorized&quot; }
&lt;p&gt;Sun Jun  2 03:17:27 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 3031 operations&lt;br/&gt;
Sun Jun  2 03:17:27 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun  2 03:17:25 51aab945:a&lt;br/&gt;
Sun Jun  2 03:18:32 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: AssertionException dbclient error communicating with server: localhost:37017&lt;br/&gt;
repl: sleep 2 sec before next pass&lt;br/&gt;
Sun Jun  2 03:18:34 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:18:34 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Sun Jun  2 03:18:34 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sun Jun  2 03:18:37 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:18:37 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Sun Jun  2 03:18:37 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sun Jun  2 03:18:40 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:18:43 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; replHandshake res not: 0 res: &lt;/p&gt;
{ ok: 0.0, errmsg: &quot;unauthorized&quot; }
&lt;p&gt;Sun Jun  2 03:19:44 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 2885 operations&lt;br/&gt;
Sun Jun  2 03:19:44 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun  2 03:19:42 51aab9ce:5&lt;br/&gt;
Sun Jun  2 03:20:45 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1320 operations&lt;br/&gt;
Sun Jun  2 03:20:45 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun  2 03:20:44 51aaba0c:9&lt;br/&gt;
Sun Jun  2 03:21:46 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1305 operations&lt;br/&gt;
Sun Jun  2 03:21:46 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun  2 03:21:44 51aaba48:f&lt;br/&gt;
Sun Jun  2 03:22:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: AssertionException dbclient error communicating with server: localhost:37017&lt;br/&gt;
repl: sleep 2 sec before next pass&lt;br/&gt;
Sun Jun  2 03:22:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:22:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Sun Jun  2 03:22:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sun Jun  2 03:23:00 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:23:00 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: couldn&apos;t connect to server localhost:37017&lt;br/&gt;
Sun Jun  2 03:23:00 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: sleep 3 sec before next pass&lt;br/&gt;
Sun Jun  2 03:23:03 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sun Jun  2 03:23:06 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; replHandshake res not: 0 res: &lt;/p&gt;
{ ok: 0.0, errmsg: &quot;unauthorized&quot; }

&lt;p&gt;It appears that the unauthorized message comes up at the beginning of the connection establishment, but the slave can continue after the message. No idea why 2.2.4 slaves can continue, while same slaves running 2.4.3 has only partial success.&lt;/p&gt;

&lt;p&gt;Obviously this is not a long-term workaround, as eventually I want all mongod to be running 2.4+, not 2.2.&lt;/p&gt;</comment>
                            <comment id="350614" author="ckpeter" created="Sat, 1 Jun 2013 15:14:40 +0000"  >&lt;p&gt;Just to add some information:&lt;/p&gt;

&lt;p&gt;The error seems to come up sporadically. On the one good instance, it comes up once upon startup, then replication continues successfully:&lt;/p&gt;

&lt;p&gt;Sat Jun 01 10:08:54.310 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl: syncing from host:localhost:37017&lt;br/&gt;
Sat Jun 01 10:08:54.529 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; replHandshake res not: 0 res: &lt;/p&gt;
{ ok: 0.0, errmsg:
&quot;unauthorized&quot; }
&lt;p&gt;Sat Jun 01 10:09:55.951 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 2880 operations&lt;br/&gt;
Sat Jun 01 10:09:55.951 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 01 10:09:55 51aa0ec3:a&lt;br/&gt;
Sat Jun 01 10:10:56.920 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1320 operations&lt;br/&gt;
Sat Jun 01 10:10:56.920 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   syncedTo: Jun 01 10:10:56 51aa0f00:d&lt;br/&gt;
Sat Jun 01 10:11:57.013 &lt;span class=&quot;error&quot;&gt;&amp;#91;replslave&amp;#93;&lt;/span&gt; repl:   checkpoint applied 1305 operations&lt;br/&gt;
...&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="78001">DOCS-1571</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="78005">SERVER-9865</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>22.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 3 Jun 2013 19:17:54 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        10 years, 36 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            10 years, 36 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>auto</customfieldvalue>
            <customfieldvalue>ckpeter</customfieldvalue>
            <customfieldvalue>spencer@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrmr8f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrqx1j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>70871</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10166" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Tests Written</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10154"><![CDATA[Complete]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsvnt3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>