<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 08:52:23 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[JAVA-481] Driver not retrying on Connection timed out SocketException</title>
                <link>https://jira.mongodb.org/browse/JAVA-481</link>
                <project id="10006" key="JAVA">Java Driver</project>
                    <description>&lt;p&gt;I&apos;ve got a MongoDB replica set across two data centers. In my second data center I have some servers that point back to the primary instance in data center 1.&lt;/p&gt;

&lt;p&gt;I ran into a connection timeout issue on the server (this happens pretty consistently). Here is the stack trace:&lt;/p&gt;

&lt;p&gt;com.mongodb.DBPortPool gotError&lt;br/&gt;
WARNING: emptying DBPortPool to 10.240.110.42:27017 b/c of error&lt;br/&gt;
java.net.SocketException: Connection timed out&lt;br/&gt;
	at java.net.SocketInputStream.socketRead0(Native Method)&lt;br/&gt;
	at java.net.SocketInputStream.read(SocketInputStream.java:129)&lt;br/&gt;
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)&lt;br/&gt;
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)&lt;br/&gt;
	at java.io.BufferedInputStream.read(BufferedInputStream.java:313)&lt;br/&gt;
	at org.bson.io.Bits.readFully(Bits.java:35)&lt;br/&gt;
	at org.bson.io.Bits.readFully(Bits.java:28)&lt;br/&gt;
	at com.mongodb.Response.&amp;lt;init&amp;gt;(Response.java:39)&lt;br/&gt;
	at com.mongodb.DBPort.go(DBPort.java:123)&lt;br/&gt;
	at com.mongodb.DBPort.go(DBPort.java:82)&lt;br/&gt;
	at com.mongodb.DBPort.call(DBPort.java:72)&lt;br/&gt;
	at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:202)&lt;br/&gt;
	at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:303)&lt;br/&gt;
	at com.mongodb.DBCollection.findOne(DBCollection.java:565)&lt;br/&gt;
	at com.mongodb.DBCollection.findOne(DBCollection.java:554)&lt;/p&gt;



&lt;p&gt;Is it possible for the driver to attempt to recreate the connections and retry the query? It looks like the next query worked as expected.&lt;/p&gt;


&lt;p&gt;I do not see these errors on servers in the same data center as the primary MongoDB server.&lt;/p&gt;

&lt;p&gt;Note: latency between my app server and the primary MongoDB server is ~50 ms.&lt;/p&gt;

&lt;p&gt;THANKS!&lt;/p&gt;</description>
                <environment></environment>
        <key id="25910">JAVA-481</key>
            <summary>Driver not retrying on Connection timed out SocketException</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="mariano.escribano@mongodb.com">Mariano Escribano</assignee>
                                    <reporter username="jed204">John Danner</reporter>
                        <labels>
                            <label>connection</label>
                            <label>driver</label>
                            <label>query</label>
                    </labels>
                <created>Fri, 2 Dec 2011 17:43:58 +0000</created>
                <updated>Wed, 3 Jan 2018 14:02:34 +0000</updated>
                            <resolved>Fri, 2 Dec 2011 18:48:23 +0000</resolved>
                                    <version>2.6.5</version>
                                                    <component>Performance</component>
                                        <votes>0</votes>
                                    <watches>7</watches>
                    <comments>
                            <comment id="315346" author="brettcave" created="Wed, 17 Apr 2013 12:23:13 +0000"  >&lt;p&gt;This issue also occurs frequently in AWS EC2 (Amazon Web Services). We see this daily. When reviewing our mongo configuration for production, we came across the production checklist, which included a suggestion to drop the TCP_KEEPALIVE kernel configuration to a lower value. When we dropped it to 5 minutes, the errors started occurring much more frequently. We have now set keepalive back to 7200 seconds (at OS level, not driver level) to reduce the frequency of this occurring.&lt;/p&gt;
</comment>
                            <comment id="103767" author="jed204" created="Mon, 26 Mar 2012 21:14:36 +0000"  >&lt;p&gt;I think the issue is that the default Linux keep-alive timeout is 2 hours before the first keep-alive packet is sent, while the default connection timeout on a Cisco ASA is 1 hour. This would suggest that setting the socketKeepAlive option won&apos;t really do anything (in this situation/configuration).&lt;/p&gt;

&lt;p&gt;I&apos;ll adjust one of my system&apos;s keep-alive parameters to see if it will fix my issue. Perhaps someone else who happens upon this ticket will find it useful. If there is a desire for a more robust keep-alive that isn&apos;t reliant on system or firewall settings, I can submit this code as a patch.&lt;/p&gt;</comment>
                            <comment id="103755" author="antoine" created="Mon, 26 Mar 2012 20:40:34 +0000"  >&lt;p&gt;That should be the point of socketKeepAlive.&lt;br/&gt;
It just does a socket.setKeepAlive(true) on the Java socket, which should then make the TCP stack send heartbeats on the socket.&lt;br/&gt;
It would be better if we could figure out why this is not fixing the firewall issue, rather than adding extra code or a thread that does it at the app layer.&lt;/p&gt;</comment>
                            <comment id="103753" author="jed204" created="Mon, 26 Mar 2012 20:33:53 +0000"  >&lt;p&gt;I should note I&apos;m using the socketKeepAlive option within the driver, but the connectivity behavior remains.&lt;/p&gt;</comment>
                            <comment id="103751" author="jed204" created="Mon, 26 Mar 2012 20:29:26 +0000"  >&lt;p&gt;I believe the root cause of the issue I&apos;m experiencing is a firewall closing down the inactive connections. The driver isn&apos;t aware of the connection status until it&apos;s passed it off to the application above.&lt;/p&gt;

&lt;p&gt;I&apos;ve updated the driver to periodically (configurable) issue the ping command to connections that have not been used in a configurable amount of time. I believe this will resolve the issue for me - would this patch be an acceptable addition to the default driver? &lt;/p&gt;

&lt;p&gt;The patch enables a monitoring thread that spins through the list of available DBPorts and checks when each was last used. If a port has been idle for more than 15 minutes, the ping command is issued, which should keep the connection alive through a firewall or similar device.&lt;/p&gt;</comment>
                            <comment id="70512" author="antoine" created="Fri, 2 Dec 2011 18:48:23 +0000"  >&lt;p&gt;Feel free to follow up, though the mongodb-user group may be a better place to get quick answers.&lt;br/&gt;
thx&lt;/p&gt;</comment>
                            <comment id="70510" author="antoine" created="Fri, 2 Dec 2011 18:46:21 +0000"  >&lt;p&gt;The driver never retries a read after a timeout exception.&lt;br/&gt;
Retrying could lead to long delays and make the whole system unstable if the reason for the timeout is an overloaded db.&lt;br/&gt;
Also, the retry feature only retries a read against a different server, so you have to use slaveOk to allow reads from a replica.&lt;/p&gt;

&lt;p&gt;Solutions for you:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;use slaveOk and the driver will read from the closest server (though your app must accept eventual consistency)&lt;/li&gt;
	&lt;li&gt;retry manually in your app.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrhbhr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>14662</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>