<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 08:37:25 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[GODRIVER-1920] Aggregate can time out without setting socket options</title>
                <link>https://jira.mongodb.org/browse/GODRIVER-1920</link>
                <project id="14289" key="GODRIVER">Go Driver</project>
                    <description>&lt;p&gt;It seems that aggregate can occasionally cause a timeout with a TCP error like&lt;br/&gt;
unable to decode message length: read tcp 10.1.129.128:65170-&amp;gt;3.216.165.122:27017: read: operation timed out&lt;br/&gt;
even when no options related to timeouts have been set.&lt;/p&gt;

&lt;p&gt;This operation should not time out, as the default socket options are supposed to block indefinitely. Some timeout, either default or manually set, is causing this failure.&lt;/p&gt;</description>
                <environment></environment>
        <key id="1649534">GODRIVER-1920</key>
            <summary>Aggregate can time out without setting socket options</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.mongodb.org/images/icons/priorities/minor.svg">Minor - P4</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="benji.rewis@mongodb.com">Benji Rewis</reporter>
                        <labels>
                    </labels>
                <created>Mon, 15 Mar 2021 23:05:14 +0000</created>
                <updated>Wed, 30 Mar 2022 23:44:24 +0000</updated>
                                                                                                <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="4064000" author="stephen.white" created="Wed, 15 Sep 2021 19:02:43 +0000"  >&lt;p&gt;An interesting point discovered in the &lt;a href=&quot;https://jira.mongodb.org/browse/MHOUSE-2004&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;https://jira.mongodb.org/browse/MHOUSE-2004&lt;/a&gt;&#160;investigation was that the macOS-specific issue only happened when the VPN was active on corp-issued laptops. It appears that the VPN may be filtering TCP keepalive packets.&lt;/p&gt;</comment>
                            <comment id="3673731" author="benji.rewis" created="Fri, 19 Mar 2021 15:55:39 +0000"  >&lt;p&gt;Moving this back to Open, as the investigation did not yield anything productive. To summarize, all we&apos;ve learned is that the early timeouts seem to happen only on macOS when connected to the particular data lake listed in the original code. It may be an issue with the macOS TCP keepalive configuration on that data lake.&lt;/p&gt;</comment>
                            <comment id="3672126" author="benji.rewis" created="Thu, 18 Mar 2021 20:05:42 +0000"  >&lt;p&gt;Hello &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=david.golden&quot; class=&quot;user-hover&quot; rel=&quot;david.golden&quot;&gt;david.golden&lt;/a&gt; !&lt;/p&gt;

&lt;p&gt;We haven&#8217;t found a root cause yet in the Go driver. I&#8217;ve been able to reproduce the early timeout with the code you provided, but it&#8217;s unclear to me if/where that timeout is getting set in the Go driver.&lt;/p&gt;

&lt;p&gt;Hypothetically, a read timeout would be set &lt;a href=&quot;https://github.com/mongodb/mongo-go-driver/blob/master/x/mongo/driver/topology/connection.go#L392&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt; in readWireMessage, but from what I can tell, the only non-zero timeout that is set (a zero timeout means block indefinitely) is 10s for heartbeats, which is not the cause of the larger timeout.&lt;/p&gt;

&lt;p&gt;I&#8217;ve been testing on macOS, and I find it odd that Linux doesn&#8217;t have the same issue. That made me think it&#8217;s some complication with a Mac TCP keepalive configuration (see Mac TCP options &lt;a href=&quot;https://developer.apple.com/documentation/network/tcp_options&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The issue persists on Go versions 1.14.15, 1.15.15, and 1.16.2 (latest), which means that upgrading your Go version will probably not help.&lt;/p&gt;

&lt;p&gt;Interestingly, I cannot reproduce the error on a local replica set. I set a fail point to block an aggregate for 10 minutes, and the query still succeeds.&lt;/p&gt;

&lt;p&gt;I&#8217;ve seen similar issues in the TOOLS repo (&lt;a href=&quot;https://jira.mongodb.org/browse/TOOLS-638&quot; title=&quot;Mongodump fails with message: read tcp 127.0.0.1:27017: i/o timeout&quot; class=&quot;issue-link&quot; data-issue-key=&quot;TOOLS-638&quot;&gt;&lt;del&gt;TOOLS-638&lt;/del&gt;&lt;/a&gt;), but they very clearly involved sessions/transactions, which this issue does not.&lt;/p&gt;</comment>
                            <comment id="3665779" author="david.golden" created="Mon, 15 Mar 2021 23:27:36 +0000"  >&lt;p&gt;Notes on the situation that led me to ask on Slack:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Client connected to an Atlas Data Lake in the Cloud staging environment&lt;/li&gt;
	&lt;li&gt;Driver version was v1.1.3. This was not intentional; somehow this is what Go found in my environment. I&apos;ll recheck with a newer version.
	&lt;ul&gt;
		&lt;li&gt;Same result with v1.5.0&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Code is &lt;a href=&quot;https://gist.github.com/xdg/fa9c103229b493a02a8e5ed0dc652bbc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;I don&apos;t know how long it took before the timeout. I&apos;ll add instrumentation and recheck.
	&lt;ul&gt;
		&lt;li&gt;Timeout occurs at around 5 min 25-27 sec (note: the pool was not pre-warmed, so the variability could be related to connection/handshake time)&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;This was on macOS Big Sur, Intel, Go 1.14.15; I&apos;ll try on Linux, too.
	&lt;ul&gt;
		&lt;li&gt;On Linux (Go 1.14.15), I did not see timeouts.  All queries eventually got responses over 27 minutes.&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I can provide credentials to the data lake if that&apos;s helpful for repro.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10257" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Documentation Changes</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="11861"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hykszb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            </customfields>
    </item>
</channel>
</rss>