<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:18:17 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-28488] WT Performance drops to zero during cache eviction with cache full</title>
                <link>https://jira.mongodb.org/browse/SERVER-28488</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;We are currently benchmarking our workload against the following test environment:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;MongoDB 3.2.11&lt;/li&gt;
	&lt;li&gt;Sharded Cluster (2 shards with 1 or 2 nodes in the replica set)&lt;/li&gt;
	&lt;li&gt;Each node has 2x20 core CPUs and 32GB of memory&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Our synthetic workload consists of:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;450 threads doing an insert, waiting for 1 to 100ms and then updating the same document querying by the _id.&lt;/li&gt;
	&lt;li&gt;We are using 2 mongos.&lt;/li&gt;
	&lt;li&gt;We use the mgo golang driver, which appears to skew connections to one of the two mongos.&lt;/li&gt;
	&lt;li&gt;We have also included our workload benchmarks in the upload.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;We routinely see database operations drop from around 12000/s to 0 ops/s for several seconds at a time during what we believe to be WT cache eviction. After tuning the number of WT eviction threads to 8, eviction trigger(70), eviction target(68), eviction_dirty_trigger(20), eviction_dirty_target(18), we get a far more stable throughput of ~4000 ops/s with drops to 0 ops/s for only a second or so.&lt;/p&gt;

&lt;p&gt;During the drops to 0 ops/s, we see no disk IO (we&apos;re using ZFS with a 1GB L2 cache, no L2ARC, verified with iostat) and we become CPU bound.&lt;/p&gt;

&lt;p&gt;We are fully aware that we are pushing the system really hard but we didn&apos;t expect this behaviour during cache eviction. This issue does not occur when we are only running 200 worker threads against the system.&lt;/p&gt;</description>
                <environment>MongoDB 3.2.11 with WiredTiger on FreeBSD 11. &lt;br/&gt;
&lt;br/&gt;
Sharded Cluster&lt;br/&gt;
-shard 1 with 2 nodes in the replica set&lt;br/&gt;
-shard 2 with 1 node in the replica set</environment>
        <key id="367613">SERVER-28488</key>
            <summary>WT Performance drops to zero during cache eviction with cache full</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="kelsey.schubert@mongodb.com">Kelsey Schubert</assignee>
                                    <reporter username="weishan">Wei Shan Ang</reporter>
                        <labels>
                    </labels>
                <created>Fri, 24 Mar 2017 18:26:22 +0000</created>
                <updated>Wed, 21 Jun 2017 21:23:31 +0000</updated>
                            <resolved>Mon, 17 Apr 2017 18:35:49 +0000</resolved>
                                    <version>3.2.11</version>
                                                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="1567580" author="weishan" created="Tue, 9 May 2017 09:16:31 +0000"  >&lt;p&gt;Hi @Ramon Fernandez,&lt;/p&gt;

&lt;p&gt;Please note that we are already running 3.2.12 and we are still seeing the same issue.&lt;/p&gt;

&lt;p&gt;Could you re-open the ticket please?&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Wei Shan&lt;/p&gt;</comment>
                            <comment id="1554618" author="domodwyer" created="Fri, 21 Apr 2017 13:57:36 +0000"  >&lt;p&gt;Hi @Alexander Gorrod - I&apos;ve been working with @Wei Shan Ang to test the throughput of mongodb for our usage scenario. We&apos;ve been collecting various stats while running a workload generator that inserts a document, and then updates the document to add roughly 4k bytes of data after a small delay. We are currently testing against version 3.2.12.&lt;/p&gt;

&lt;p&gt;Our workload generator is the only application running on a 40 CPU core box, connected via a 1Gbps LAN to mongos - it performs &lt;b&gt;no IO other than talking to mongo&lt;/b&gt;; its sole purpose is to load test mongo. Monitoring the network interface, we never come remotely close to maxing the link, and we&apos;re nowhere near maxing the CPU of the host either.&lt;/p&gt;

&lt;p&gt;We&apos;ve correlated the drops to 0 throughput with a declining TrackedDirtyBytesInTheCache delta (measured every second using db.serverStatus()), which is what led us to believe it was related to the cache eviction. During periods of 0 throughput, the mongod boxes become CPU bound (also 40 core boxes) - this is what led us to conclude mongo was stalling as opposed to anything on the application side. See the attached graph which is a plot of throughput vs. mongod CPU usage.&lt;/p&gt;

&lt;p&gt;After resizing the WiredTiger cache from 19GB to 1GB (with wiredTiger.engineConfig.cacheSizeGB=1) the throughput drops are much shorter in duration but still noticeable. We also tested setting &lt;tt&gt;eviction_dirty_trigger=10&lt;/tt&gt; and &lt;tt&gt;eviction_dirty_target=7&lt;/tt&gt; (with a 19GB cache) as you suggested, but we still see periods of 0 throughput.&lt;/p&gt;

&lt;p&gt;I have uploaded new diagnostic data from a mongod instance, and below are a couple of links to various stats we&apos;ve recorded during various (different) testing runs in the hope they help. If you agree this is a mongo server issue, can we please re-open the ticket? I can run tests and provide any info needed over the next couple of weeks.&lt;/p&gt;

&lt;p&gt;CPU + disk IO stats: &lt;a href=&quot;https://docs.google.com/a/itsallbroken.com/spreadsheets/d/1o6N23E_Nc1dBnFC_XXs6qWAXcdKAGjlgRuGHuVEXzn0/edit?usp=sharing&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.google.com/a/itsallbroken.com/spreadsheets/d/1o6N23E_Nc1dBnFC_XXs6qWAXcdKAGjlgRuGHuVEXzn0/edit?usp=sharing&lt;/a&gt;&lt;br/&gt;
db.serverStatus() + TrackedDirtyBytesInTheCache plot (switch sheets): &lt;a href=&quot;https://docs.google.com/a/itsallbroken.com/spreadsheets/d/1tsK8vOun4IHyfhpA7mXQ9GrO9zneoWvL2dENj2EImug/edit?usp=sharing&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.google.com/a/itsallbroken.com/spreadsheets/d/1tsK8vOun4IHyfhpA7mXQ9GrO9zneoWvL2dENj2EImug/edit?usp=sharing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;a id=&quot;154467_thumb&quot; href=&quot;https://jira.mongodb.org/secure/attachment/154467/154467_Screen+Shot+2017-04-21+at+14.23.35.png&quot; title=&quot;Screen Shot 2017-04-21 at 14.23.35.png&quot; file-preview-type=&quot;image&quot; file-preview-id=&quot;154467&quot; file-preview-title=&quot;Screen Shot 2017-04-21 at 14.23.35.png&quot;&gt;&lt;img src=&quot;https://jira.mongodb.org/secure/thumbnail/154467/_thumb_154467.png&quot; style=&quot;border: 0px solid black&quot; role=&quot;presentation&quot;/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;</comment>
                            <comment id="1552970" author="ramon.fernandez" created="Wed, 19 Apr 2017 19:35:38 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=vgalu&quot; class=&quot;user-hover&quot; rel=&quot;vgalu&quot;&gt;vgalu&lt;/a&gt;, please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussions the &lt;a href=&quot;http://groups.google.com/group/mongodb-user&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;mongodb-user group&lt;/a&gt; or &lt;a href=&quot;http://stackoverflow.com/questions/tagged/mongodb&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Stack Overflow with the &lt;tt&gt;mongodb&lt;/tt&gt; tag&lt;/a&gt; are better forums. You can also see our &lt;a href=&quot;https://www.mongodb.org/about/support/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Technical Support page&lt;/a&gt; for additional support resources.&lt;/p&gt;

&lt;p&gt;If you test with a newer version of MongoDB and you believe you&apos;ve found a bug please post in this ticket and we&apos;ll reopen to investigate.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1550841" author="vgalu" created="Mon, 17 Apr 2017 19:46:42 +0000"  >&lt;p&gt;Hello &lt;span class=&quot;error&quot;&gt;&amp;#91;@alexander.gorrod@mongodb.com&amp;#93;&lt;/span&gt;, thank you for looking into this. I would like to kindly ask that you keep the ticket open for now. Your interpretation of the uploaded data is the opposite of what we see on the test platform. During the dips our I/O is 0, while the CPU of the replica set primary is completely hogged. We have since run further tests that suggest an issue in the WiredTiger cache: shrinking it to 1GB and using all of the available RAM for the ZFS ARC, which sees hit rates of ~90%, ensures consistent performance across lengthy runs.&lt;/p&gt;

&lt;p&gt;It is possible that the included diagnostic information is not accurate. We are going to repeat the test and update the ticket, if that is fine by you.&lt;/p&gt;</comment>
                            <comment id="1550753" author="alexander.gorrod" created="Mon, 17 Apr 2017 18:32:12 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=weishan.ang%40gmail.com&quot; class=&quot;user-hover&quot; rel=&quot;weishan.ang@gmail.com&quot;&gt;weishan.ang@gmail.com&lt;/a&gt; I have taken a look at the data you uploaded, and it appears as though &lt;b&gt;your application is I/O bound&lt;/b&gt;. There are a number of times when all threads are waiting for either reads or writes. My experience is that often periods of slow/zero throughput correspond with I/O being overwhelmed.&lt;/p&gt;

&lt;p&gt;I can see in the diagnostic data that it is taking a long time for the storage engine to create a checkpoint (durable point in time snapshot on disk in the data files), which is another indication that I/O is overwhelmed. I would suggest trying with an even lower &lt;tt&gt;eviction_dirty_trigger=10&lt;/tt&gt; and &lt;tt&gt;eviction_dirty_target=7&lt;/tt&gt; since you have quite a large cache configured, and the cost of creating a checkpoint is proportional to the amount of &quot;dirty&quot; content in the cache.&lt;/p&gt;

&lt;p&gt;MongoDB has also been working hard to optimize for high-update throughput workloads, and so I would recommend upgrading to the latest version of MongoDB to see the benefit of those improvements.&lt;/p&gt;</comment>
                            <comment id="1533054" author="weishan" created="Mon, 27 Mar 2017 10:54:46 +0000"  >&lt;p&gt;Correction:&lt;/p&gt;

&lt;p&gt;We routinely see database operations drop from around 12000/s to 0 ops/s for several seconds at a time during what we believe to be WT cache eviction. After tuning the number of WT eviction threads to 8, eviction trigger(70), eviction target(68), eviction_dirty_trigger(20), eviction_dirty_target(18), we get a far more stable throughput of &lt;font color=&quot;red&quot;&gt;~8000&lt;/font&gt; ops/s with drops to 0 ops/s for only a second or so.&lt;/p&gt;</comment>
                            <comment id="1532378" author="weishan" created="Fri, 24 Mar 2017 18:28:39 +0000"  >&lt;p&gt;I have uploaded the logs to MongoDB portal at &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/6b355378-9cfe-4b04-a62f-84ccb64dd47b.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/6b355378-9cfe-4b04-a62f-84ccb64dd47b.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The uploaded file name is mongodb_upload_&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-28488&quot; title=&quot;WT Performance drops to zero during cache eviction with cache full&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-28488&quot;&gt;&lt;del&gt;WT-3236&lt;/del&gt;&lt;/a&gt;.zip&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="385494">SERVER-29311</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="154467" name="Screen Shot 2017-04-21 at 14.23.35.png" size="796696" author="domodwyer" created="Fri, 21 Apr 2017 13:56:05 +0000"/>
                            <attachment id="154468" name="Screen Shot 2017-04-21 at 14.47.22.png" size="556480" author="domodwyer" created="Fri, 21 Apr 2017 13:55:59 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 24 Mar 2017 18:42:16 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 40 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>backlog-server-pm</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 40 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>alexander.gorrod@mongodb.com</customfieldvalue>
            <customfieldvalue>domodwyer</customfieldvalue>
            <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>vgalu</customfieldvalue>
            <customfieldvalue>weishan</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht4q2f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsx5bb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[kelsey.schubert@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrj05j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>