<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:53:28 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-39909] mongod killed by oom-killer</title>
                <link>https://jira.mongodb.org/browse/SERVER-39909</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;I am trying to gain insight into an ongoing issue we&apos;ve been having. Some configuration info:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;MongoDB 3.2.12&lt;/li&gt;
	&lt;li&gt;centos-release-6-8.el6.centos.12.3.x86_64 running in a 64GB VMware VM&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Two stand-alone instances of mongod run on this server: no sharding, no replication. There are several other consumers of memory on this server, mostly 5 or 6 Java programs, only one of which consumes any significant memory (~16GB). That one Java program is the main application that depends on MongoDB.&lt;/p&gt;

&lt;p&gt;What we observe is everything working OK for 2-5 days and then the oom-killer decides to kill one of the mongods. This has happened 5-6 times over the last month. Typically, only 1 mongod is killed but there was at least one occasion where both were.&lt;/p&gt;

&lt;p&gt;Output from &apos;sar&apos; shows the same pattern again and again: consumption of RAM, sometimes fairly rapid, over the 2-5 days, followed by ~61GB of RAM in use for a day or two and then the oom-killer does its thing.&lt;/p&gt;

&lt;p&gt;I should mention that I tried to constrain the WT cache size to 12GB for each of the two mongods. This seemed to prevent the oom-killer from firing, but our application became &apos;unresponsive&apos;.&lt;/p&gt;

&lt;p&gt;I should also mention that I&apos;ve read a ton of MongoDB Jiras on this issue and, while I know that 3.2.12 is getting long in the tooth, many of the improvements in WT&apos;s memory management were supposedly, though not exclusively, in 3.2.10.&lt;/p&gt;

&lt;p&gt;As to our application&apos;s &apos;access patterns&apos;, it&apos;s difficult to be precise but my sense is that it combines periodic bursts of write activity along with the occasional (human-driven) reading of a very large collection. (In this regard I am familiar with the Jiras that discuss MongoDB threads turning their attention to cache eviction rather than servicing application requests - but I believe that this issue was improved in 3.2.10).&lt;/p&gt;

&lt;p&gt;In any event, I would be very grateful for some bright light on this ongoing and very frustrating problem. At a minimum, if I could upload the WT diagnostics and someone at MongoDB could run their internal visualizer against it - along with an analysis, that would be a good start.&lt;/p&gt;

&lt;p&gt;Thank you for your help.&lt;/p&gt;

</description>
                <environment></environment>
        <key id="707454">SERVER-39909</key>
            <summary>mongod killed by oom-killer</summary>
                <type id="6" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14720&amp;avatarType=issuetype">Question</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="daniel.hatcher@mongodb.com">Danny Hatcher</assignee>
                                    <reporter username="saultocsin">PMB</reporter>
                        <labels>
                    </labels>
                <created>Fri, 1 Mar 2019 18:32:32 +0000</created>
                <updated>Mon, 6 May 2019 19:40:15 +0000</updated>
                            <resolved>Mon, 6 May 2019 19:40:15 +0000</resolved>
                                    <version>3.2.12</version>
                                                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="2224752" author="daniel.hatcher" created="Thu, 25 Apr 2019 15:36:14 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=saultocsin&quot; class=&quot;user-hover&quot; rel=&quot;saultocsin&quot;&gt;saultocsin&lt;/a&gt; are you still experiencing this issue?&lt;/p&gt;</comment>
                            <comment id="2172982" author="daniel.hatcher" created="Wed, 6 Mar 2019 18:19:46 +0000"  >&lt;p&gt;Additionally, as you may have noticed through your testing, you will need to manually decrease the WT cache size from its default if you continue to run multiple &lt;tt&gt;mongod&lt;/tt&gt; processes on the same server. The default currently uses a percentage of the total amount of RAM available on the box and does not account for other processes when doing so.&lt;/p&gt;</comment>
                            <comment id="2172663" author="daniel.hatcher" created="Wed, 6 Mar 2019 15:41:09 +0000"  >&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;I apologize; I should have been more clear regarding my suggestion of cgroups. There&apos;s an issue we&apos;ve commonly seen in the past when a &lt;tt&gt;mongod&lt;/tt&gt; process is co-located with other processes on a server, especially other &lt;tt&gt;mongod&lt;/tt&gt; processes. If another process starts to cannibalize more memory on the system (say, a different &lt;tt&gt;mongod&lt;/tt&gt; receives a sharp increase in load), it can cause a cascading effect where your original &lt;tt&gt;mongod&lt;/tt&gt; suddenly has less memory to allocate to itself. If you have the machine resources pre-allocated in cgroups, a rise in memory in one process has a much better chance of not killing other processes. However, as you said, this will not prevent a single &lt;tt&gt;mongod&lt;/tt&gt; from running out of memory, so it was simply something to take into consideration moving forward.&lt;/p&gt;

&lt;p&gt;In terms of the memory issue, upgrading would be your best bet. The primary reason I suggested going to the latest version of 3.4 instead of 3.6 or 4.0 was that there are fewer &lt;a href=&quot;https://docs.mongodb.com/manual/release-notes/3.4-compatibility/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;compatibility changes&lt;/a&gt; to consider relative to your current version of 3.2. If you feel comfortable upgrading even further, please do so! Whichever version you choose, please let me know if the upgrade resolves the issue or if you are still experiencing memory pressure. I can take a further look then.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;/p&gt;

&lt;p&gt;Danny&lt;/p&gt;</comment>
                            <comment id="2172401" author="saultocsin" created="Wed, 6 Mar 2019 12:22:52 +0000"  >&lt;p&gt;Hi Daniel,&lt;/p&gt;

&lt;p&gt;Thank you for getting back to me about this and apologies for my delayed&lt;br/&gt;
reply.&lt;/p&gt;

&lt;p&gt;I have considered, and even experimented with, Linux&apos;s overcommit tuning&lt;br/&gt;
knobs. But unless I am sorely mistaken, they don&apos;t really address the&lt;br/&gt;
issue. Rather, they shift the failure from oom-killer to the application&lt;br/&gt;
itself, i.e., a MongoDB malloc call might fail which I suspect will cause&lt;br/&gt;
the process&apos;s termination.&lt;/p&gt;
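
&lt;p&gt;As a rough illustration of that shift, the following Python sketch computes the commit limit the kernel enforces in strict overcommit mode (vm.overcommit_memory=2). It assumes the commonly documented formula: swap plus (RAM minus huge pages) times vm.overcommit_ratio / 100. The function name and the 8 GB swap figure are hypothetical; only the 64 GB of RAM comes from this report.&lt;/p&gt;

```python
# Sketch: commit limit under strict overcommit (vm.overcommit_memory=2).
# Assumed formula: swap + (RAM - huge pages) * overcommit_ratio / 100.

def commit_limit_gb(ram_gb, swap_gb, overcommit_ratio=50, hugepages_gb=0.0):
    """Return the assumed commit limit in GB (hypothetical helper)."""
    return swap_gb + (ram_gb - hugepages_gb) * overcommit_ratio / 100.0

# 64 GB RAM is from the report; 8 GB swap and the ratios are illustrative.
print(commit_limit_gb(64, 8))                       # kernel default ratio of 50
print(commit_limit_gb(64, 8, overcommit_ratio=80))  # a more permissive ratio
```

&lt;p&gt;In this mode, an allocation that would push committed memory past the limit fails in the calling process instead of waking the oom-killer, which is the trade-off described above.&lt;/p&gt;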

&lt;p&gt;Can you explain how the use of cgroups might solve WT&apos;s memory problem?&lt;/p&gt;

&lt;p&gt;Might I also ask why you recommend 3.4.19 rather than, say, 3.6.11 or a 4.X?&lt;/p&gt;

&lt;p&gt;Thanks again.&lt;/p&gt;

</comment>
                            <comment id="2168046" author="daniel.hatcher" created="Fri, 1 Mar 2019 22:00:27 +0000"  >&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;I&apos;m sorry to hear that you&apos;ve been having these extended issues. Unfortunately, as you&apos;ve pointed out, 3.2.12 is an old version and has officially reached end-of-life status. Some improvements were made in 3.2.10, but improving WiredTiger performance was and continues to be an ongoing process. Additionally, 3.2.x lacks some diagnostic capability that later versions provide. Would it be possible for you to upgrade to 3.4.19? If you do, we can take a look at the resulting status of your cluster.&lt;/p&gt;

&lt;p&gt;As a note, we generally recommend using one &lt;tt&gt;mongod&lt;/tt&gt; process per server with no major competition for system resources. If you must host everything on the same server, you may wish to try using a mechanism like cgroups to assign resources to your &lt;tt&gt;mongod&lt;/tt&gt; processes. You may be able to avoid some OOM events if you don&apos;t have the &lt;tt&gt;mongod&lt;/tt&gt; processes competing with each other for available RAM.&lt;/p&gt;
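
&lt;p&gt;To illustrate the competition for RAM, here is a minimal Python sketch. It assumes the default cache rule documented for MongoDB 3.2 (the larger of 60% of RAM minus 1 GB, or 1 GB), which should be checked against the documentation for the version actually in use. The function names are hypothetical, and the WiredTiger cache is only part of what a &lt;tt&gt;mongod&lt;/tt&gt; process allocates.&lt;/p&gt;

```python
# Sketch: default WiredTiger cache size under the assumed MongoDB 3.2 rule
# (the larger of 60% of RAM minus 1 GB, or 1 GB).

def default_wt_cache_gb(total_ram_gb):
    """Default cache size in GB for one mongod, per the assumed 3.2 rule."""
    return max(0.6 * total_ram_gb - 1.0, 1.0)

def combined_cache_gb(total_ram_gb, mongod_count):
    """Combined default cache; each mongod sizes itself from total RAM."""
    return mongod_count * default_wt_cache_gb(total_ram_gb)

ram_gb = 64  # VM size from the report
per_process = default_wt_cache_gb(ram_gb)
combined = combined_cache_gb(ram_gb, 2)
print(f"per mongod: {per_process:.1f} GB, two mongods: {combined:.1f} GB of {ram_gb} GB")
```

&lt;p&gt;Under that assumption, two &lt;tt&gt;mongod&lt;/tt&gt; processes on the 64GB VM would each default to roughly 37.4 GB of cache, about 74.8 GB combined, which is more than the machine has.&lt;/p&gt;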

&lt;p&gt;Thank you,&lt;/p&gt;

&lt;p&gt;Danny&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 1 Mar 2019 22:00:27 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 41 weeks, 6 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>daniel.hatcher@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 41 weeks, 6 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>daniel.hatcher@mongodb.com</customfieldvalue>
            <customfieldvalue>saultocsin</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hup7mn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|huexkv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|huotvz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>