<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:01:56 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-3049] com.mongodb.MongoException: setShardVersion failed host[mongodb04.example.com:27018] { errmsg: &quot;not master&quot;, ok: 0.0 }</title>
                <link>https://jira.mongodb.org/browse/SERVER-3049</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;After replica set failover, I get the following exception in my apps:&lt;/p&gt;

&lt;p&gt;com.mongodb.MongoException: setShardVersion failed host&lt;span class=&quot;error&quot;&gt;&amp;#91;mongodb04.example.com:27018&amp;#93;&lt;/span&gt; &lt;/p&gt;
{ errmsg: &quot;not master&quot;, ok: 0.0 }

&lt;p&gt;Strangely, I do not get this message for every request. It seems to happen about 50% of the time. The other 50% of the requests succeed without errors. I have seen this go on for several minutes (e.g. 15-20). The only way I can resolve it is by failing back to the original master, or by restarting the mongos on the appserver. Reproducing this problem is very simple:&lt;/p&gt;

&lt;p&gt;1.) Run &quot;watch -n 1 curl -v &lt;a href=&quot;http://example.com/something/that/queries/mongodb&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://example.com/something/that/queries/mongodb&lt;/a&gt;&quot; on the appserver&lt;br/&gt;
2.) Run &quot;rs.stepDown()&quot; on a MongoDB master&lt;br/&gt;
3.) Watch your curl command intermittently fail (seemingly) forever&lt;br/&gt;
4.) Run &quot;rs.stepDown()&quot; on the new MongoDB master (fail back to original master)&lt;br/&gt;
5.) Watch your curl command succeed&lt;/p&gt;

&lt;p&gt;Additionally after failover, I see several messages like this in my mongos.log (on the order of 30-40 per second):&lt;/p&gt;

&lt;p&gt;Thu May  5 16:34:43 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn78&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }

&lt;p&gt;These go away as soon as I fail back to the original master. I don&apos;t know if this is related to the same issue, so I created another ticket for this: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-3040&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;https://jira.mongodb.org/browse/SERVER-3040&lt;/a&gt;&lt;/p&gt;</description>
                <environment>Ubuntu Linux on EC2</environment>
        <key id="16673">SERVER-3049</key>
            <summary>com.mongodb.MongoException: setShardVersion failed host[mongodb04.example.com:27018] { errmsg: &quot;not master&quot;, ok: 0.0 }</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="greg_10gen">Greg Studer</assignee>
                                    <reporter username="mconigliaro">Michael Conigliaro</reporter>
                        <labels>
                            <label>failed</label>
                            <label>master</label>
                            <label>not</label>
                            <label>setShardVersion</label>
                    </labels>
                <created>Thu, 5 May 2011 16:42:36 +0000</created>
                <updated>Tue, 12 Jul 2016 00:18:14 +0000</updated>
                            <resolved>Thu, 12 May 2011 20:48:09 +0000</resolved>
                                    <version>1.8.1</version>
                                    <fixVersion>1.8.3</fixVersion>
                    <fixVersion>1.9.1</fixVersion>
                                    <component>Sharding</component>
                                        <votes>0</votes>
                                    <watches>4</watches>
                                    <workratio workratioPercent="0"/>
                                                                    <timeoriginalestimate seconds="0">0 minutes</timeoriginalestimate>
                            <timeestimate seconds="0">0 minutes</timeestimate>
                                        <comments>
                            <comment id="41335" author="greg_10gen" created="Fri, 8 Jul 2011 14:53:33 +0000"  >&lt;p&gt;The fix for this will be in 1.8.3, and should be in the 1.8 nightlies now.&lt;/p&gt;</comment>
                            <comment id="41259" author="raylu" created="Fri, 8 Jul 2011 04:08:14 +0000"  >&lt;p&gt;So has this been backported? If I downloaded mongodb from the downloads page, does it include the fix?&lt;/p&gt;</comment>
                            <comment id="32375" author="greg_10gen" created="Thu, 12 May 2011 20:48:09 +0000"  >&lt;p&gt;a fix which we&apos;re backporting to 1.8.2 should solve this problem - pretty sure it&apos;s a bad combination of old random slave choice behavior with an edge case that allows duplicate replica set nodes...&lt;/p&gt;</comment>
                            <comment id="31718" author="mconigliaro" created="Mon, 9 May 2011 21:07:41 +0000"  >&lt;p&gt;I was finally able to fail back to the original master. Here&apos;s what that looked like. That insane amount of logging also stopped:&lt;/p&gt;

&lt;p&gt;Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn23&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn23&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; SocketException: remote:  error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; WriteBackListener exception : DBClientBase::findOne: transport error: mongodb05.example.com:27018 query: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef31dc02d61ad0bcce3&apos;) }
&lt;p&gt;Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;MessagingPort recv() errno:104 Connection reset by peer 10.120.14.79:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; SocketException: remote:  error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; &lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; warning: Could not get last error.DBClientBase::findOne: transport error: mongodb05.example.com:27018 query: &lt;/p&gt;
{ getlasterror: 1 }
&lt;p&gt;Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; MessagingPort recv() errno:104 Connection reset by peer 10.120.14.79:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; SocketException: remote:  error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; &lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: caught exception mongodb05.example.com:27018 DBClientBase::findOne: transport error: mongodb05.example.com:27018 query: &lt;/p&gt;
{ ismaster: 1 }
&lt;p&gt;Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; SocketException: remote:  error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: caught exception mongodb04.example.com:27018 DBClientBase::findOne: transport error: mongodb04.example.com:27018 query: &lt;/p&gt;
{ ismaster: 1 }
&lt;p&gt;Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; SocketException: remote:  error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn21&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: caught exception mongodb04.example.com:27018 DBClientBase::findOne: transport error: mongodb04.example.com:27018 query: &lt;/p&gt;
{ ismaster: 1 }
&lt;p&gt;Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; _check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; trying reconnect to mongodb05.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; reconnect mongodb05.example.com:27018 ok&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: false, secondary: true, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;trying reconnect to mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; reconnect mongodb04.example.com:27018 ok&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb04.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: false, secondary: true, hosts: [ &quot;mongodb04.example.com:27018&quot;, &quot;mongodb05.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], primary: &quot;mongodb05.example.com:27018&quot;, maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;trying reconnect to mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:55 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:55 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn18&amp;#93;&lt;/span&gt; reconnect mongodb04.example.com:27018 ok&lt;br/&gt;
Mon May  9 21:02:56 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; MessagingPort recv() errno:104 Connection reset by peer 10.120.14.79:27018&lt;br/&gt;
Mon May  9 21:02:56 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; SocketException: remote:  error: 9001 socket exception &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; &lt;br/&gt;
Mon May  9 21:02:56 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; DBClientCursor::init call() failed&lt;br/&gt;
Mon May  9 21:02:56 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; WriteBackListener exception : DBClientBase::findOne: transport error: mongodb05.example.com:27018 query: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef31dc02d61ad0bcce3&apos;) }
&lt;p&gt;Mon May  9 21:02:56 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn23&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb05.example.com:27018&quot;, &quot;mongodb04.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;_check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018&lt;br/&gt;
Mon May  9 21:02:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;LockPinger&amp;#93;&lt;/span&gt; dist_lock pinged successfully for: appserver11:1304719091:1804289383&lt;br/&gt;
Mon May  9 21:02:57 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:57 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:57 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:58 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; creating new connection to:mongodb05.example.com:27018&lt;br/&gt;
Mon May  9 21:02:58 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:02:58 BackgroundJob starting: ConnectBG&lt;br/&gt;
Mon May  9 21:03:14 &lt;span class=&quot;error&quot;&gt;&amp;#91;ReplicaSetMonitorWatcher&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb10.example.com:27018 &lt;/p&gt;
{ setName: &quot;4&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb10.example.com:27018&quot;, &quot;mongodb11.example.com:27018&quot; ], arbiters: [ &quot;mongodb12.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;checking replica set: 1&lt;br/&gt;
Mon May  9 21:03:14 &lt;span class=&quot;error&quot;&gt;&amp;#91;ReplicaSetMonitorWatcher&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb01.example.com:27018 &lt;/p&gt;
{ setName: &quot;1&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb01.example.com:27018&quot;, &quot;mongodb02.example.com:27018&quot; ], arbiters: [ &quot;mongodb03.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;checking replica set: 2&lt;br/&gt;
Mon May  9 21:03:14 &lt;span class=&quot;error&quot;&gt;&amp;#91;ReplicaSetMonitorWatcher&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb04.example.com:27018 &lt;/p&gt;
{ setName: &quot;2&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb04.example.com:27018&quot;, &quot;mongodb05.example.com:27018&quot; ], arbiters: [ &quot;mongodb06.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;checking replica set: 3&lt;br/&gt;
Mon May  9 21:03:14 &lt;span class=&quot;error&quot;&gt;&amp;#91;ReplicaSetMonitorWatcher&amp;#93;&lt;/span&gt; ReplicaSetMonitor::_checkConnection: mongodb07.example.com:27018 &lt;/p&gt;
{ setName: &quot;3&quot;, ismaster: true, secondary: false, hosts: [ &quot;mongodb07.example.com:27018&quot;, &quot;mongodb08.example.com:27018&quot; ], arbiters: [ &quot;mongodb09.example.com:27018&quot; ], maxBsonObjectSize: 16777216, ok: 1.0 }
&lt;p&gt;checking replica set: 4&lt;br/&gt;
Mon May  9 21:03:16 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; writebacklisten result: &lt;/p&gt;
{ noop: true, ok: 1.0 }
&lt;p&gt;Mon May  9 21:03:17 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; writebacklisten result: &lt;/p&gt;
{ noop: true, ok: 1.0 }
&lt;p&gt;Mon May  9 21:03:18 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; writebacklisten result: &lt;/p&gt;
{ noop: true, ok: 1.0 }
&lt;p&gt;Mon May  9 21:03:19 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; writebacklisten result: &lt;/p&gt;
{ noop: true, ok: 1.0 }
&lt;p&gt;Mon May  9 21:03:19 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; writebacklisten result: &lt;/p&gt;
{ noop: true, ok: 1.0 }
&lt;p&gt;Mon May  9 21:03:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;WriteBackListener&amp;#93;&lt;/span&gt; writebacklisten result: &lt;/p&gt;
{ noop: true, ok: 1.0 }</comment>
                            <comment id="31701" author="greg_10gen" created="Mon, 9 May 2011 20:41:55 +0000"  >&lt;p&gt;i&apos;m investigating that duplicate now - if you have a chance, can you send additional mongos logs for the period before and after that duplicate host is added?&lt;/p&gt;</comment>
                            <comment id="31698" author="mconigliaro" created="Mon, 9 May 2011 20:35:45 +0000"  >&lt;p&gt;No chunks should have migrated. I actually have that disabled at the moment, since we experienced performance problems with it in the past.&lt;/p&gt;

&lt;p&gt;I&apos;ve actually never noticed that duplicate host in there. When I look back in the logs, the only time I see that message with the duplicate host is immediately after a failover. I just restarted the mongos on one machine, and the message looked like this instead: &lt;/p&gt;

&lt;p&gt;Mon May  9 20:33:18 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn1&amp;#93;&lt;/span&gt; _check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018&lt;/p&gt;</comment>
                            <comment id="31692" author="greg_10gen" created="Mon, 9 May 2011 20:21:44 +0000"  >&lt;p&gt;has a chunk recently migrated to/from that rs?  There&apos;s an issue where processing a migration (all at once) can put a lot of load on the slave.&lt;/p&gt;

&lt;p&gt;Also, this line of your mongos logs - &lt;br/&gt;
Mon May  9 18:58:56 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn960&amp;#93;&lt;/span&gt; _check : 2/mongodb05.example.com:27018,mongodb04.example.com:27018,mongodb04.example.com:27018 - does the duplicate mongodb04 always appear?  Or does it start appearing after the stepdown?&lt;/p&gt;</comment>
                            <comment id="31677" author="mconigliaro" created="Mon, 9 May 2011 19:51:51 +0000"  >&lt;p&gt;Additionally, I see that the replica set slave delay is slowly climbing (1688 seconds and counting), so I can&apos;t fail back to the original master, because it&apos;s too far behind. There are no errors in the mongodb.log on either the master or slave. Here&apos;s an example of what I&apos;m looking at. Everything seems normal to me.&lt;/p&gt;

&lt;p&gt;Mon May  9 19:48:22 &lt;span class=&quot;error&quot;&gt;&amp;#91;dur&amp;#93;&lt;/span&gt; lsn set 213147014&lt;br/&gt;
Mon May  9 19:48:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43108&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef36c891bb10035067e&apos;) }
&lt;p&gt; reslen:60 300006ms&lt;br/&gt;
Mon May  9 19:48:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43107&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef21cdc26a42769eebf&apos;) }
&lt;p&gt; reslen:60 300008ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43125&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef3849f0a1c88287d78&apos;) }
&lt;p&gt; reslen:60 300005ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43122&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef3b3948f87df7f48ac&apos;) }
&lt;p&gt; reslen:60 300008ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43124&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef3eaeb10781fbe5b24&apos;) }
&lt;p&gt; reslen:60 300006ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43120&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef35a579354135266ba&apos;) }
&lt;p&gt; reslen:60 300003ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43121&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef3a8567aa60347bcd4&apos;) }
&lt;p&gt; reslen:60 300006ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43126&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef3944c9431f24cc4d0&apos;) }
&lt;p&gt; reslen:60 300005ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43128&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef32c217a86054ca7f2&apos;) }
&lt;p&gt; reslen:60 300005ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43123&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef380f90945682bc23b&apos;) }
&lt;p&gt; reslen:60 300004ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43127&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef31dc02d61ad0bcce3&apos;) }
&lt;p&gt; reslen:60 300007ms&lt;br/&gt;
Mon May  9 19:48:59 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43129&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ writebacklisten: ObjectId(&apos;4dc46ef386adf40920898eb0&apos;) }
&lt;p&gt; reslen:60 300005ms&lt;br/&gt;
Mon May  9 19:49:16 &lt;span class=&quot;error&quot;&gt;&amp;#91;dur&amp;#93;&lt;/span&gt; lsn set 213178119&lt;br/&gt;
Mon May  9 19:50:17 &lt;span class=&quot;error&quot;&gt;&amp;#91;dur&amp;#93;&lt;/span&gt; lsn set 213209004&lt;br/&gt;
Mon May  9 19:50:17 &lt;span class=&quot;error&quot;&gt;&amp;#91;LockPinger&amp;#93;&lt;/span&gt; dist_lock pinged successfully for: mongodb04:1304633912:16024625&lt;/p&gt;

&lt;p&gt;EDIT: I take it back. There are some errors in the slave about a cursor now. I&apos;ve seen these before, but I don&apos;t know if they mean anything:&lt;/p&gt;

&lt;p&gt;Mon May  9 19:50:37 &lt;span class=&quot;error&quot;&gt;&amp;#91;replica set sync&amp;#93;&lt;/span&gt;  example.userEventsJournal warning: cursor loc null does not match byLoc position a:711ba56c !&lt;br/&gt;
Mon May  9 19:51:19 &lt;span class=&quot;error&quot;&gt;&amp;#91;dur&amp;#93;&lt;/span&gt; lsn set 213240040&lt;br/&gt;
Mon May  9 19:51:27 &lt;span class=&quot;error&quot;&gt;&amp;#91;replica set sync&amp;#93;&lt;/span&gt;  example.userEventsJournal warning: cursor loc null does not match byLoc position 7:6bc64614 !&lt;br/&gt;
Mon May  9 19:51:32 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn43467&amp;#93;&lt;/span&gt; query admin.$cmd ntoreturn:1 command: &lt;/p&gt;
{ serverStatus: 1.0 }
&lt;p&gt; reslen:1643 101ms&lt;br/&gt;
Mon May  9 19:51:40 &lt;span class=&quot;error&quot;&gt;&amp;#91;replica set sync&amp;#93;&lt;/span&gt;  example.userEventsJournal warning: cursor loc null does not match byLoc position 7:2a4c2130 !&lt;br/&gt;
Mon May  9 19:51:48 &lt;span class=&quot;error&quot;&gt;&amp;#91;replica set sync&amp;#93;&lt;/span&gt;  example.userEventsJournal warning: cursor loc null does not match byLoc position 9:b76d398 !&lt;br/&gt;
Mon May  9 19:52:28 &lt;span class=&quot;error&quot;&gt;&amp;#91;dur&amp;#93;&lt;/span&gt; lsn set 213271095&lt;br/&gt;
Mon May  9 19:52:57 &lt;span class=&quot;error&quot;&gt;&amp;#91;replica set sync&amp;#93;&lt;/span&gt;  example.userEventsJournal warning: cursor loc null does not match byLoc position 8:27fe053c !&lt;/p&gt;

&lt;p&gt;EDIT: Just for the record, here&apos;s the output of rs.status(). Nothing wrong there that I can see...&lt;/p&gt;

&lt;p&gt;2:PRIMARY&amp;gt; rs.status()              &lt;br/&gt;
{&lt;br/&gt;
	&quot;set&quot; : &quot;2&quot;,&lt;br/&gt;
	&quot;date&quot; : ISODate(&quot;2011-05-09T19:56:29Z&quot;),&lt;br/&gt;
	&quot;myState&quot; : 1,&lt;br/&gt;
	&quot;members&quot; : [&lt;br/&gt;
		{&lt;br/&gt;
			&quot;_id&quot; : 0,&lt;br/&gt;
			&quot;name&quot; : &quot;mongodb04.example.com:27018&quot;,&lt;br/&gt;
			&quot;health&quot; : 1,&lt;br/&gt;
			&quot;state&quot; : 2,&lt;br/&gt;
			&quot;stateStr&quot; : &quot;SECONDARY&quot;,&lt;br/&gt;
			&quot;uptime&quot; : 873,&lt;br/&gt;
			&quot;optime&quot; : &lt;/p&gt;
{
				&quot;t&quot; : 1304968941000,
				&quot;i&quot; : 520
			}
&lt;p&gt;,&lt;br/&gt;
			&quot;optimeDate&quot; : ISODate(&quot;2011-05-09T19:22:21Z&quot;),&lt;br/&gt;
			&quot;lastHeartbeat&quot; : ISODate(&quot;2011-05-09T19:56:28Z&quot;)&lt;br/&gt;
		},&lt;br/&gt;
		{&lt;br/&gt;
			&quot;_id&quot; : 1,&lt;br/&gt;
			&quot;name&quot; : &quot;mongodb05.example.com:27018&quot;,&lt;br/&gt;
			&quot;health&quot; : 1,&lt;br/&gt;
			&quot;state&quot; : 1,&lt;br/&gt;
			&quot;stateStr&quot; : &quot;PRIMARY&quot;,&lt;br/&gt;
			&quot;optime&quot; : &lt;/p&gt;
{
				&quot;t&quot; : 1304970988000,
				&quot;i&quot; : 2
			}
&lt;p&gt;,&lt;br/&gt;
			&quot;optimeDate&quot; : ISODate(&quot;2011-05-09T19:56:28Z&quot;),&lt;br/&gt;
			&quot;self&quot; : true&lt;br/&gt;
		},&lt;br/&gt;
		{&lt;br/&gt;
			&quot;_id&quot; : 2,&lt;br/&gt;
			&quot;name&quot; : &quot;mongodb06.example.com:27018&quot;,&lt;br/&gt;
			&quot;health&quot; : 1,&lt;br/&gt;
			&quot;state&quot; : 7,&lt;br/&gt;
			&quot;stateStr&quot; : &quot;ARBITER&quot;,&lt;br/&gt;
			&quot;uptime&quot; : 873,&lt;br/&gt;
			&quot;optime&quot; : &lt;/p&gt;
{
				&quot;t&quot; : 0,
				&quot;i&quot; : 0
			}
&lt;p&gt;,&lt;br/&gt;
			&quot;optimeDate&quot; : ISODate(&quot;1970-01-01T00:00:00Z&quot;),&lt;br/&gt;
			&quot;lastHeartbeat&quot; : ISODate(&quot;2011-05-09T19:56:28Z&quot;)&lt;br/&gt;
		}&lt;br/&gt;
	],&lt;br/&gt;
	&quot;ok&quot; : 1&lt;br/&gt;
}&lt;/p&gt;

&lt;p&gt;EDIT: Shard status looks OK to me too...&lt;/p&gt;

&lt;p&gt;&amp;gt; db.printShardingStatus()&lt;br/&gt;
&amp;#8212; Sharding Status &amp;#8212; &lt;br/&gt;
  sharding version: &lt;/p&gt;
{ &quot;_id&quot; : 1, &quot;version&quot; : 3 }
&lt;p&gt;  shards:&lt;br/&gt;
      {&lt;br/&gt;
	&quot;_id&quot; : &quot;2&quot;,&lt;br/&gt;
	&quot;host&quot; : &quot;2/mongodb05.example.com:27018,mongodb04.example.com:27018&quot;&lt;br/&gt;
}&lt;br/&gt;
      {&lt;br/&gt;
	&quot;_id&quot; : &quot;1&quot;,&lt;br/&gt;
	&quot;host&quot; : &quot;1/mongodb01.example.com:27018,mongodb02.example.com:27018&quot;&lt;br/&gt;
}&lt;br/&gt;
      {&lt;br/&gt;
	&quot;_id&quot; : &quot;3&quot;,&lt;br/&gt;
	&quot;host&quot; : &quot;3/mongodb07.example.com:27018,mongodb08.example.com:27018&quot;&lt;br/&gt;
}&lt;br/&gt;
      {&lt;br/&gt;
	&quot;_id&quot; : &quot;4&quot;,&lt;br/&gt;
	&quot;host&quot; : &quot;4/mongodb10.example.com:27018,mongodb11.example.com:27018&quot;&lt;br/&gt;
}&lt;br/&gt;
  databases:&lt;/p&gt;
	{ &quot;_id&quot; : &quot;admin&quot;, &quot;partitioned&quot; : false, &quot;primary&quot; : &quot;config&quot; }
	{ &quot;_id&quot; : &quot;test&quot;, &quot;partitioned&quot; : false, &quot;primary&quot; : &quot;2&quot; }
	{ &quot;_id&quot; : &quot;example&quot;, &quot;partitioned&quot; : true, &quot;primary&quot; : &quot;2&quot; }
&lt;p&gt;		example.stats chunks:&lt;br/&gt;
				4	22&lt;br/&gt;
				3	24&lt;br/&gt;
				1	22&lt;br/&gt;
				2	23&lt;br/&gt;
			too many chunks to print, use verbose if you want to force print&lt;br/&gt;
		example.userEventsJournal chunks:&lt;br/&gt;
				4	194&lt;br/&gt;
				1	202&lt;br/&gt;
				3	189&lt;br/&gt;
				2	172&lt;br/&gt;
			too many chunks to print, use verbose if you want to force print&lt;br/&gt;
		example.userfile chunks:&lt;br/&gt;
				4	2&lt;br/&gt;
				2	5&lt;br/&gt;
			{ &quot;userId&quot; : &lt;/p&gt;
{ $minKey : 1 }
&lt;p&gt; } --&amp;gt;&amp;gt; &lt;/p&gt;
{ &quot;userId&quot; : &quot;000009a6-02e5-4277-90a3-fb38264440f9&quot; }
&lt;p&gt; on : 4 &lt;/p&gt;
{ &quot;t&quot; : 7000, &quot;i&quot; : 0 }
&lt;p&gt;			&lt;/p&gt;
{ &quot;userId&quot; : &quot;000009a6-02e5-4277-90a3-fb38264440f9&quot; }
&lt;p&gt; --&amp;gt;&amp;gt; &lt;/p&gt;
{ &quot;userId&quot; : &quot;3295759f-14d4-46b3-bcac-22e6b8c35400&quot; }
&lt;p&gt; on : 2 &lt;/p&gt;
{ &quot;t&quot; : 8000, &quot;i&quot; : 2 }
&lt;p&gt;			&lt;/p&gt;
{ &quot;userId&quot; : &quot;3295759f-14d4-46b3-bcac-22e6b8c35400&quot; }
&lt;p&gt; --&amp;gt;&amp;gt; &lt;/p&gt;
{ &quot;userId&quot; : &quot;651a8cb3-7df8-43bf-bbee-6c58a1f88b28&quot; }
&lt;p&gt; on : 2 &lt;/p&gt;
{ &quot;t&quot; : 8000, &quot;i&quot; : 4 }
&lt;p&gt;			&lt;/p&gt;
{ &quot;userId&quot; : &quot;651a8cb3-7df8-43bf-bbee-6c58a1f88b28&quot; }
&lt;p&gt; --&amp;gt;&amp;gt; &lt;/p&gt;
{ &quot;userId&quot; : &quot;97cf4d36-c30d-4888-bd26-6a3ef8dcb662&quot; }
&lt;p&gt; on : 2 &lt;/p&gt;
{ &quot;t&quot; : 8000, &quot;i&quot; : 6 }
&lt;p&gt;			&lt;/p&gt;
{ &quot;userId&quot; : &quot;97cf4d36-c30d-4888-bd26-6a3ef8dcb662&quot; }
&lt;p&gt; --&amp;gt;&amp;gt; &lt;/p&gt;
{ &quot;userId&quot; : &quot;ca921ed8-67bb-44e7-a802-8c48a8ae8c8d&quot; }
&lt;p&gt; on : 2 &lt;/p&gt;
{ &quot;t&quot; : 8000, &quot;i&quot; : 8 }
&lt;p&gt;			&lt;/p&gt;
{ &quot;userId&quot; : &quot;ca921ed8-67bb-44e7-a802-8c48a8ae8c8d&quot; }
&lt;p&gt; --&amp;gt;&amp;gt; &lt;/p&gt;
{ &quot;userId&quot; : &quot;testuser2&quot; }
&lt;p&gt; on : 2 &lt;/p&gt;
{ &quot;t&quot; : 8000, &quot;i&quot; : 9 }
&lt;p&gt;			&lt;/p&gt;
{ &quot;userId&quot; : &quot;testuser2&quot; }
&lt;p&gt; --&amp;gt;&amp;gt; { &quot;userId&quot; : &lt;/p&gt;
{ $maxKey : 1 }
&lt;p&gt; } on : 4 &lt;/p&gt;
{ &quot;t&quot; : 8000, &quot;i&quot; : 0 }
	{ &quot;_id&quot; : &quot;local&quot;, &quot;partitioned&quot; : false, &quot;primary&quot; : &quot;2&quot; }


&lt;p&gt;EDIT: Looks like I&apos;m IO-bound on the slave, so that would explain why it can&apos;t catch up. Though I have no idea what the slave is doing to cause all that IO...&lt;/p&gt;

&lt;p&gt;Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util&lt;br/&gt;
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00&lt;br/&gt;
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00&lt;br/&gt;
sdh               0.00   284.08  162.19  329.35  4187.06  7657.71    24.10    19.39   38.89   2.02  99.50&lt;/p&gt;</comment>
                            <comment id="31666" author="mconigliaro" created="Mon, 9 May 2011 19:32:28 +0000"  >&lt;p&gt;This is appearing in some of my other logs. That little piece of a stack trace looks scary:&lt;/p&gt;

&lt;p&gt;Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt; going to retry checkShardVersion&lt;br/&gt;
Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt; DBConfig unserialize: example &lt;/p&gt;
{ _id: &quot;example&quot;, partitioned: true, primary: &quot;2&quot; }
&lt;p&gt;Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt;     setShardVersion  2 mongodb04.example.com:27018  example.userfile  &lt;/p&gt;
{ setShardVersion: &quot;example.userfile&quot;, configdb: &quot;mongodb-config03.example.com:27019,mongodb-config02.example.co...&quot;, version: Timestamp 8000|9, serverID: ObjectId(&apos;4dc46ef31dc02d61ad0bcce3&apos;), authoritative: true, shard: &quot;2&quot;, shardHost: &quot;2/mongodb05.example.com:27018,mongodb04.example.com:27018&quot; }
&lt;p&gt; 0x7fbfa0002750&lt;br/&gt;
Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt;        setShardVersion failed!&lt;/p&gt;
{ errmsg: &quot;not master&quot;, ok: 0.0 }
&lt;p&gt;Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt;      setShardVersion failed host&lt;span class=&quot;error&quot;&gt;&amp;#91;mongodb04.example.com:27018&amp;#93;&lt;/span&gt; &lt;/p&gt;
{ errmsg: &quot;not master&quot;, ok: 0.0 }
&lt;p&gt;Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt; Assertion: 10429:setShardVersion failed host&lt;span class=&quot;error&quot;&gt;&amp;#91;mongodb04.example.com:27018&amp;#93;&lt;/span&gt; &lt;/p&gt;
{ errmsg: &quot;not master&quot;, ok: 0.0 }
&lt;p&gt;0x51fc59 0x69ca53 0x69c5e2 &lt;br/&gt;
 /usr/bin/mongos(_ZN5mongo11msgassertedEiPKc+0x129) &lt;span class=&quot;error&quot;&gt;&amp;#91;0x51fc59&amp;#93;&lt;/span&gt;&lt;br/&gt;
 /usr/bin/mongos() &lt;span class=&quot;error&quot;&gt;&amp;#91;0x69ca53&amp;#93;&lt;/span&gt;&lt;br/&gt;
 /usr/bin/mongos() &lt;span class=&quot;error&quot;&gt;&amp;#91;0x69c5e2&amp;#93;&lt;/span&gt;&lt;br/&gt;
Mon May  9 19:24:21 &lt;span class=&quot;error&quot;&gt;&amp;#91;conn22&amp;#93;&lt;/span&gt; AssertionException in process: setShardVersion failed host&lt;span class=&quot;error&quot;&gt;&amp;#91;mongodb04.example.com:27018&amp;#93;&lt;/span&gt; &lt;/p&gt;
{ errmsg: &quot;not master&quot;, ok: 0.0 }</comment>
                            <comment id="31664" author="mconigliaro" created="Mon, 9 May 2011 19:29:28 +0000"  >&lt;p&gt;OK, I reproduced the problem. The log file is ridiculous due to this problem: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-3040&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;https://jira.mongodb.org/browse/SERVER-3040&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="31415" author="greg_10gen" created="Fri, 6 May 2011 22:34:52 +0000"  >&lt;p&gt;Been there &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.mongodb.org/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;. No worries, but yes, definitely send us the logs when/if you see the behavior again. One thing to look out for is an unresponsive member of the set after stepDown(), which may be why your queries fail intermittently.&lt;/p&gt;</comment>
                            <comment id="31414" author="mconigliaro" created="Fri, 6 May 2011 22:21:10 +0000"  >&lt;p&gt;And of course, now I can&apos;t reproduce the problem, even though I could reproduce it easily yesterday afternoon. The only thing that&apos;s changed is that I truncated all the mongos logs before restarting all the mongos processes. I wonder if they need to be running for a long time before I can get them stuck in that state. Anyway, I&apos;ll keep trying...&lt;/p&gt;</comment>
                            <comment id="31403" author="mconigliaro" created="Fri, 6 May 2011 20:34:45 +0000"  >&lt;p&gt;Yes, this is a web app with a RESTful interface that communicates with a local mongos instance that points to 4 replica set shards. We&apos;re using a Sinatra-like framework for Scala called BlueEyes, which was actually developed in-house, so there may be issues with the way we&apos;re using the MongoDB driver (I know I&apos;ve found some issues with what they were doing in the past). Here&apos;s a link to the source for reference:&lt;/p&gt;</comment>

&lt;p&gt;&lt;a href=&quot;https://github.com/jdegoes/blueeyes/tree/master/src/main/scala/blueeyes/persistence/mongo&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/jdegoes/blueeyes/tree/master/src/main/scala/blueeyes/persistence/mongo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I&apos;ll try reproducing this again with more log output...&lt;/p&gt;</comment>
                            <comment id="31381" author="greg_10gen" created="Fri, 6 May 2011 17:23:03 +0000"  >&lt;p&gt;Can&apos;t reproduce here. It would be really helpful to see the full logs from slightly before you run stepDown() until you run the second stepDown() to restore the master.&lt;/p&gt;

&lt;p&gt;Also, is your app doing writes or reads to the db, and is slaveOk set? I assume your setup is an app connected to a mongos instance with replica set shards...&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="11828" name="appserver11.log" size="15895539" author="mconigliaro" created="Mon, 9 May 2011 21:05:16 +0000"/>
                            <attachment id="11829" name="appserver12.log" size="22193321" author="mconigliaro" created="Mon, 9 May 2011 21:18:13 +0000"/>
                            <attachment id="11827" name="failover_to_slave.log" size="3161998" author="mconigliaro" created="Mon, 9 May 2011 19:29:28 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 6 May 2011 17:23:03 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        12 years, 32 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            12 years, 32 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10020"><![CDATA[Linux]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>greg_10gen</customfieldvalue>
            <customfieldvalue>mconigliaro</customfieldvalue>
            <customfieldvalue>raylu</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrp0br:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrif6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>21113</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht0erz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>