<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:50:04 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-19176] Mongo instance dies while creating a large replica</title>
                <link>https://jira.mongodb.org/browse/SERVER-19176</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;I&apos;m adding a new replica to a mongo replica set that holds a single huge (~3 TB) collection. The collection is very active, which means that while Mongo is seeding the data from the primary and building its primary key index, a huge number of changes occur in the collection.&lt;/p&gt;

&lt;p&gt;The problem occurs when this new replica starts to catch up on the oplog and performs checks against the primary. It looks like Mongo creates a new connection for each document (according to our monitoring, 500+ new connections a second are created from the replica to the primary). After some time of that intense flood of connections, we hit the limit on ephemeral source ports available for use, because connections spend some time in TIME_WAIT before being recycled (even with tw_reuse and tw_recycle enabled, the connection rate is high enough to exhaust all the ports).&lt;/p&gt;

&lt;p&gt;Each time we hit the issue, Mongo receives a connection error (errno 99: Cannot assign requested address) from the kernel. There is logic in &lt;tt&gt;SyncTail::getMissingDoc&lt;/tt&gt; to perform up to 3 retries before giving up, and in some cases it helps: the kernel manages to free up some ports and the connection succeeds upon retry. But since it takes a huge amount of time to seed the servers and build the primary index (up to 24 hours), the number of changes Mongo needs to apply to catch up with the oplog is huge, so there are many chances for it to fail all 3 retries and die with an &lt;tt&gt;Assertion: 15916:Can no longer connect to initial sync source&lt;/tt&gt; message.&lt;/p&gt;

&lt;p&gt;At the moment I&apos;m trying to create the replica for the 5th time (each try takes 1-2 days). So far I&apos;ve managed to catch the moment mongo started hitting the limit of 60K TIME_WAIT sockets in time to run &lt;tt&gt;cpulimit&lt;/tt&gt; on the process, which slows it down and reduces the rate of socket creation enough for the kernel to recycle those sockets. Still, this is clearly an issue that needs fixing in the code.&lt;/p&gt;
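
&lt;p&gt;A back-of-the-envelope sketch of the arithmetic behind the numbers above, in Python. It assumes the stock Linux TIME_WAIT hold time of 60 seconds with no shortening from tw_reuse or tw_recycle, and it treats the ephemeral port range as a parameter. The function names are hypothetical and only illustrate the estimate; they are not part of the Mongo code.&lt;/p&gt;

```python
# Steady-state TIME_WAIT estimate: every closed connection holds its source
# port for the TIME_WAIT period, so sockets in TIME_WAIT = rate * hold time.
TIME_WAIT_SECS = 60  # assumption: stock Linux value, tw_reuse/tw_recycle ignored


def time_wait_sockets(conn_per_sec, hold_secs=TIME_WAIT_SECS):
    # Number of sockets expected to sit in TIME_WAIT at a constant rate.
    return conn_per_sec * hold_secs


def port_headroom(conn_per_sec, port_low, port_high, hold_secs=TIME_WAIT_SECS):
    # Ports left once the TIME_WAIT backlog is subtracted from the ephemeral
    # range. Zero or negative means new connections to the same destination
    # start failing with errno 99 (Cannot assign requested address).
    available = port_high - port_low + 1
    return available - time_wait_sockets(conn_per_sec, hold_secs)


if __name__ == "__main__":
    # Port range 32768-60999 is an assumed, commonly seen Linux default.
    for rate in (500, 600):
        print(rate, time_wait_sockets(rate), port_headroom(rate, 32768, 60999))
```

&lt;p&gt;Under these assumptions, 500 connections a second already keep about 30000 sockets in TIME_WAIT, which is more than the 28232 ports in the assumed default range. A wider range only raises the rate at which the same exhaustion happens.&lt;/p&gt;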

&lt;p&gt;Some additional resources:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;TIME_WAIT sockets count on the replica during the incident: &lt;a href=&quot;https://www.evernote.com/l/ADQS6_1-TVpGKpMgSiHHxX8f0AG3Yvhe-ZgB/image.png&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.evernote.com/l/ADQS6_1-TVpGKpMgSiHHxX8f0AG3Yvhe-ZgB/image.png&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;Netstat excerpt with TIME_WAIT sockets during the incident: &lt;a href=&quot;https://www.evernote.com/l/ADRM1hjdAeZJFZnjPTwv-9VnxDBweX3kZvsB/image.png&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.evernote.com/l/ADRM1hjdAeZJFZnjPTwv-9VnxDBweX3kZvsB/image.png&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;TCP stack status graph, showing 500-600 new connections established a second from the replica: &lt;a href=&quot;https://www.evernote.com/l/ADQku5K0WhtEgZ4q0Lei5mHEG9sRpfZHpiAB/image.png&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.evernote.com/l/ADQku5K0WhtEgZ4q0Lei5mHEG9sRpfZHpiAB/image.png&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="213573">SERVER-19176</key>
            <summary>Mongo instance dies while creating a large replica</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="ramon.fernandez@mongodb.com">Ramon Fernandez Marina</assignee>
                                    <reporter username="kovyrin">Oleksiy Kovyrin</reporter>
                        <labels>
                    </labels>
                <created>Sun, 28 Jun 2015 15:29:24 +0000</created>
                <updated>Mon, 3 Aug 2015 21:39:10 +0000</updated>
                            <resolved>Mon, 3 Aug 2015 21:39:10 +0000</resolved>
                                    <version>2.6.10</version>
                    <version>3.0.4</version>
                                                    <component>Networking</component>
                    <component>Replication</component>
                                        <votes>0</votes>
                                    <watches>7</watches>
                                                                                                                <comments>
                            <comment id="995516" author="ramon.fernandez" created="Mon, 3 Aug 2015 21:38:53 +0000"  >&lt;p&gt;Thanks for the detailed report &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=kovyrin&quot; class=&quot;user-hover&quot; rel=&quot;kovyrin&quot;&gt;kovyrin&lt;/a&gt;. This issue was reported before in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-18721&quot; title=&quot;Initial-sync can exhaust available ports with rapid short-lived connections&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-18721&quot;&gt;&lt;del&gt;SERVER-18721&lt;/del&gt;&lt;/a&gt;, which we want to address in the current development cycle.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-18721&quot; title=&quot;Initial-sync can exhaust available ports with rapid short-lived connections&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-18721&quot;&gt;&lt;del&gt;SERVER-18721&lt;/del&gt;&lt;/a&gt; has a posted workaround, which may or may not work in your case depending on the specifics of your workload. The alternative is to &lt;a href=&quot;http://docs.mongodb.org/manual/tutorial/resync-replica-set-member/#sync-by-copying-data-files-from-another-member&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;manually copy the data files first&lt;/a&gt; and then sync the secondary node; this method should drastically reduce the need to create so many connections.&lt;/p&gt;

&lt;p&gt;I&apos;m going to mark this ticket as a duplicate of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-18721&quot; title=&quot;Initial-sync can exhaust available ports with rapid short-lived connections&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-18721&quot;&gt;&lt;del&gt;SERVER-18721&lt;/del&gt;&lt;/a&gt;. Feel free to vote for that ticket and watch it for updates.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="207163">SERVER-18721</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 3 Aug 2015 21:38:53 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        8 years, 28 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            8 years, 28 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>kovyrin</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrl21j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsaqlb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;p&gt;Get a huge mongo collection (3 TB), put it under an intense create/update/delete load (100+ updates per second), try to add a replica to the given replica set, and see it fail every time.&lt;/p&gt;</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrlgsf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>