<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:54:37 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-20571] mongod invoked oom-killer in 3.0.6</title>
                <link>https://jira.mongodb.org/browse/SERVER-20571</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We have a single replica set with 3 nodes running 3.0.6 with WiredTiger. The hosts are running 14.04 LTS Ubuntu in AWS. The instance that we use is I2.8xlarge with 240GB RAM. &lt;/p&gt;

&lt;p&gt;Today we had a primary crash with oom-killer. Another primary was elected and within about 5 minutes even that server invoked oom-killer.&lt;/p&gt;

&lt;p&gt;Here is the syslog output from one of those servers: &lt;a href=&quot;http://pastebin.com/Q7bQP34V&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://pastebin.com/Q7bQP34V&lt;/a&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="231058">SERVER-20571</key>
            <summary>mongod invoked oom-killer in 3.0.6</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="4">Incomplete</resolution>
                                        <assignee username="ramon.fernandez@mongodb.com">Ramon Fernandez Marina</assignee>
                                    <reporter username="aivaturi@paloaltonetworks.com">Aditya Ivaturi</reporter>
                        <labels>
                    </labels>
                <created>Tue, 22 Sep 2015 21:20:27 +0000</created>
                <updated>Mon, 15 Nov 2021 16:54:10 +0000</updated>
                            <resolved>Wed, 28 Oct 2015 17:03:12 +0000</resolved>
                                                                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="1073008" author="ramon.fernandez" created="Wed, 28 Oct 2015 17:03:01 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;, I&apos;m going to close this ticket for now as there&apos;s not enough information for us to investigate the issue. If you manage to reproduce the OOM kill while capturing the &lt;tt&gt;serverStatus&lt;/tt&gt; data requested above please upload such data and we&apos;ll reopen the ticket.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1051437" author="aivaturi@paloaltonetworks.com" created="Mon, 5 Oct 2015 18:08:10 +0000"  >&lt;p&gt;Not yet, at least not where it mattered. The oom-killer was invoked, but on an arbiter, and I wasn&apos;t running these scripts there. Although the arbiter had only 1GB RAM and nothing else was running on it, it is still surprising that the oom-killer is being invoked on arbiters as well.&lt;/p&gt;</comment>
                            <comment id="1050497" author="ramon.fernandez" created="Sat, 3 Oct 2015 12:58:43 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;, were you able to reproduce this behavior while collecting &lt;tt&gt;serverStatus&lt;/tt&gt; metrics?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1043701" author="ramon.fernandez" created="Fri, 25 Sep 2015 17:56:29 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;, unfortunately without the &lt;tt&gt;serverStatus&lt;/tt&gt; metrics we asked for it&apos;s not possible for us to tell if the behavior you&apos;re seeing is due to a bug in memory management or a consequence of your specific configuration.&lt;/p&gt;

&lt;p&gt;Since the problem seems to appear often enough, I think the best way forward is to start collecting &lt;tt&gt;serverStatus&lt;/tt&gt; metrics and wait for another crash &amp;#8211; then upload the &lt;tt&gt;ss.log&lt;/tt&gt;, &lt;tt&gt;iostat.log&lt;/tt&gt; and &lt;tt&gt;mongod.log&lt;/tt&gt; files.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1043640" author="aivaturi@paloaltonetworks.com" created="Fri, 25 Sep 2015 17:17:02 +0000"  >&lt;p&gt;No, I was not collecting those metrics at the time of the crash. I will get the log file for rs0-2 as you requested.&lt;/p&gt;</comment>
                            <comment id="1043237" author="ramon.fernandez" created="Fri, 25 Sep 2015 11:58:39 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;, the &lt;tt&gt;atop&lt;/tt&gt; information shows increased memory and CPU utilization, but it&apos;s not enough to investigate where that memory is going or why CPU usage grows. This information is provided by the &lt;tt&gt;serverStatus&lt;/tt&gt; metrics that are being collected in the &lt;tt&gt;ss.log&lt;/tt&gt; file (the data in &lt;tt&gt;iostat.log&lt;/tt&gt; is only useful when correlated with the data in &lt;tt&gt;ss.log&lt;/tt&gt;).&lt;/p&gt;

&lt;p&gt;If you were collecting &lt;tt&gt;serverStatus&lt;/tt&gt; metrics at the time rs0-2 crashed then the &lt;tt&gt;ss.log&lt;/tt&gt; file may already contain enough information for us to investigate further. Can you please upload the &lt;tt&gt;ss.log&lt;/tt&gt; and &lt;tt&gt;iostat.log&lt;/tt&gt; files from rs0-2 so we can take a look? Can you also include the &lt;tt&gt;mongod.log&lt;/tt&gt; file from rs0-2 from the last startup to the time of the crash? You can &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/683e6509-23dc-4f3a-a69b-e0a9e9309f54.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;use this upload portal&lt;/a&gt; to send files privately and securely (i.e.: they won&apos;t be publicly visible in JIRA).&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1042882" author="aivaturi@paloaltonetworks.com" created="Fri, 25 Sep 2015 00:15:28 +0000"  >&lt;p&gt;I&apos;m still running the commands that you requested and will upload the output tomorrow morning. &lt;/p&gt;

&lt;p&gt;But in the meantime, I managed to grab the atop output around the time when rs0-2 crashed. To view this file, you need to install atop (&lt;a href=&quot;http://atoptool.nl/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://atoptool.nl/&lt;/a&gt;). The only problem is that I was capturing at 5-minute intervals, which still shows what happened with mongod, but the granularity suffers a bit. &lt;/p&gt;

&lt;p&gt;Here are the steps to view the snapshot:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;tar xzf atop_20150922.tar.gz&lt;/li&gt;
	&lt;li&gt;atop -r atop_20150922&lt;/li&gt;
	&lt;li&gt;In the window, press &quot;m&quot; (shows the memory view)&lt;/li&gt;
	&lt;li&gt;Press &quot;b&quot; and enter 20:00. The process crashes around 20:35, so we can start seeing the memory footprint&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;At 20:00, you can see that PAG (Page scans) has already kicked in. &lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;MEM | tot   240.2G  | free    5.8G  | cache  41.8G  |               | dirty   2.8M  | buff    4.5M |  slab    1.9G |               |               |               |               |               |&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;SWP | tot     0.0M  | free    0.0M  |               |               |               |              |               |               |               |               |  vmcom 240.3G |  vmlim 120.1G |&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;PAG | scan    3270  |&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;and mongod resident memory is growing&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;81342          188845               0           20340K          242.0G           188.3G           2056K           14216K          mongodb            mongodb            78%           mongod&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;Now press &quot;t&quot; to skip ahead 5 minutes (unfortunately that is the granularity it was configured with, which I will change):&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;81342          1113e3               0           20340K          242.0G           191.8G           2056K             3.5G          mongodb            mongodb            80%           mongod&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;As you can see, RGROW is at 3.5G now. Jump ahead another couple of snapshots and you end up in this state:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;MEM | tot   240.2G  | free    1.3G  | cache  18.4G  |               | dirty   7.9M  | buff    0.4M |  slab    1.7G |               |               |               |               |               |&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;SWP | tot     0.0M  | free    0.0M  |               |               |               |              |               |               |               |               |  vmcom 240.3G |  vmlim 120.1G |&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;PAG | scan  7934e3  |&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&amp;nbsp;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;81342          2401e3               0           20340K          242.0G           216.4G           2056K             7.7G          mongodb            mongodb            90%           mongod&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;By this time, mongod has gone from 78% memory utilization to 90% and is still growing. The kernel is really freaking out and trying really hard with page scans. Notice how the slab also went down from 1.9G to 1.7G at this point. There was appreciable disk activity in those 15 minutes as well. The next snapshot shows a calm system, as oom-killer has already kicked in.&lt;/p&gt;

&lt;p&gt;You can also press &quot;g&quot; to get a more generic system snapshot with CPU. Even that view shows the mongod process going crazy. &lt;/p&gt;</comment>
                            <comment id="1041327" author="aivaturi@paloaltonetworks.com" created="Wed, 23 Sep 2015 20:49:15 +0000"  >&lt;p&gt;BTW, I do have atop running on both those machines. Would that help? It will get you more data than iostat anyway.&lt;/p&gt;</comment>
                            <comment id="1041210" author="ramon.fernandez" created="Wed, 23 Sep 2015 19:15:28 +0000"  >&lt;p&gt;Yes, the wiredTigerCacheSizeGB parameter limits exactly that: the amount of memory used for WT cache. &lt;tt&gt;mongod&lt;/tt&gt; needs additional memory for other things. There are users running with 16GB, so it&apos;s not clear yet whether the issue is from the WT cache setting. Hopefully the data collection will help us in this respect.&lt;/p&gt;</comment>
                            <comment id="1041206" author="aivaturi@paloaltonetworks.com" created="Wed, 23 Sep 2015 19:10:29 +0000"  >&lt;p&gt;Sure, I&apos;ll run those. &lt;/p&gt;

&lt;p&gt;I do have a follow-up question. Almost all the RAM was being consumed by the mongod process. We did set the cache size limit to 224GB. In our case, where the system RAM is 240GB, is that too high a limit? Does the mongod process consume more memory than the maximum limit set in cache size?&lt;/p&gt;</comment>
                            <comment id="1041132" author="ramon.fernandez" created="Wed, 23 Sep 2015 18:07:36 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;, looks like the memory consumption was already quite high on these nodes, and some operation pushed them over the edge enough to trigger the OOM killer. Unfortunately without the ability to reproduce this behavior it will be difficult to understand what&apos;s causing it.&lt;/p&gt;

&lt;p&gt;What we ask users in these cases is that they collect server data by running the commands below on affected nodes from an OS shell: &lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;delay=10&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;mongo --&lt;/span&gt;&lt;span style=&quot;color: #ff1493; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;eval&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; &lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;while(true) {print(JSON.stringify(db.serverStatus({tcmalloc:1}))); sleep($delay*1000)}&quot;&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; &amp;gt;ss.log &amp;amp;&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;iostat -k -t -x $delay &amp;gt;iostat.log &amp;amp;&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;These commands will gather data every 10 seconds and save it in the &lt;tt&gt;ss.log&lt;/tt&gt; and &lt;tt&gt;iostat.log&lt;/tt&gt; files. If the problem reappears while data is being collected that may tell us what&apos;s causing this issue; if the problem doesn&apos;t reappear there may still be useful information in the collected data that can help us understand why these nodes are consuming the amount of memory they are consuming.&lt;/p&gt;

&lt;p&gt;Could you please run the above commands on all nodes of this replica set? I would let them run for a couple of days, or shorter if the problem described in this ticket reproduces earlier.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;
</comment>
                            <comment id="1040758" author="ramon.fernandez" created="Wed, 23 Sep 2015 12:47:29 +0000"  >&lt;p&gt;Thanks for the information you sent &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;. We&apos;ve seen issues with excessive memory consumption during index builds (&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-20159&quot; title=&quot;Out of memory on index build during initial sync even with low cacheSize parameter&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-20159&quot;&gt;&lt;del&gt;SERVER-20159&lt;/del&gt;&lt;/a&gt;), but the logs don&apos;t show any index builds at the time of the incident so that theory is out.&lt;/p&gt;

&lt;p&gt;We&apos;re looking at the logs and MMS info to see what we find, stay tuned.&lt;/p&gt;</comment>
                            <comment id="1040568" author="aivaturi@paloaltonetworks.com" created="Wed, 23 Sep 2015 05:17:54 +0000"  >&lt;p&gt;rs0-2 was initially primary; it crashed and then rs0-1 became primary, which was the second one to crash. I had to clean up the logs to remove the actual update statements, but the actual command log line is still there. If you need the actual queries, please reach out to me directly and I&apos;ll share them that way. &lt;/p&gt;

&lt;p&gt;The logs show the update commands taking an insane amount of time, which they normally don&apos;t. This could be related to the spike in CPU load, but the queries themselves shouldn&apos;t have caused the load - at least we haven&apos;t seen that before.&lt;/p&gt;</comment>
                            <comment id="1040435" author="aivaturi@paloaltonetworks.com" created="Wed, 23 Sep 2015 00:21:37 +0000"  >&lt;p&gt;Sure, I prefer to send it via email. Which email can I send it to?&lt;/p&gt;</comment>
                            <comment id="1040413" author="ramon.fernandez" created="Tue, 22 Sep 2015 23:42:33 +0000"  >&lt;p&gt;Sorry you&apos;ve run into this &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=ivaturipan&quot; class=&quot;user-hover&quot; rel=&quot;ivaturipan&quot;&gt;ivaturipan&lt;/a&gt;. We&apos;ll need more information to find out what&apos;s going on:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;Can you please send the &lt;tt&gt;mongod.log&lt;/tt&gt; file for the servers that were terminated by the OOM killer? For the second server it would be very useful to see the time from its election as a primary until the OS terminated it&lt;/li&gt;
	&lt;li&gt;Can you provide some more details about the load that these nodes were under at the time of the incident?&lt;/li&gt;
	&lt;li&gt;Can you send the link to this deployment in MMS so we can take a closer look? (Feel free to send this via email if you prefer)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1040306" author="aivaturi@paloaltonetworks.com" created="Tue, 22 Sep 2015 21:35:11 +0000"  >&lt;p&gt;Two other curious things: &lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;The cache activity rose dramatically just before the crash in both nodes (second screenshot).&lt;/li&gt;
	&lt;li&gt;Before the first node crashed, the data size reported by MMS was around 450GB. After the first node crashed, MMS reported it as around 900GB. After the second node crashed and was restarted, MMS again started reporting the data size as around 450GB.&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="1040286" author="aivaturi@paloaltonetworks.com" created="Tue, 22 Sep 2015 21:23:46 +0000"  >&lt;p&gt;The screenshot of MMS shows the increase in memory and when the oom triggers&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="91010" name="Screen Shot 2015-09-22 at 2.22.05 PM.png" size="74662" author="ivaturipan" created="Tue, 22 Sep 2015 21:23:46 +0000"/>
                            <attachment id="91014" name="Screen Shot 2015-09-22 at 2.30.58 PM.png" size="101377" author="ivaturipan" created="Tue, 22 Sep 2015 21:35:11 +0000"/>
                            <attachment id="91322" name="atop_20150922.tar.gz" size="2859246" author="ivaturipan" created="Fri, 25 Sep 2015 00:15:28 +0000"/>
                            <attachment id="91043" name="rs0-1 load average.png" size="166382" author="ivaturipan" created="Wed, 23 Sep 2015 05:17:54 +0000"/>
                            <attachment id="91044" name="rs0-1.log" size="533540" author="ivaturipan" created="Wed, 23 Sep 2015 05:17:54 +0000"/>
                            <attachment id="91045" name="rs0-2 load average.png" size="150645" author="ivaturipan" created="Wed, 23 Sep 2015 05:17:54 +0000"/>
                            <attachment id="91046" name="rs0-2.log" size="33019" author="ivaturipan" created="Wed, 23 Sep 2015 05:17:54 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>17.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 22 Sep 2015 23:42:33 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        8 years, 16 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            8 years, 16 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>aivaturi@paloaltonetworks.com</customfieldvalue>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrktwn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsd9sv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrhwyv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>