<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:11:57 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-26386] Application Response Times Sky Rockets After Switching To A Freshly Upgraded 3.2.9 Primary Node - Cursor Exhaustion Spotted</title>
                <link>https://jira.mongodb.org/browse/SERVER-26386</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We have upgraded a production replica set from 3.0.12 to 3.2.9, secondary members first, leaving the two last members (primary and its datacenter active fallback) to the end of the process.&lt;/p&gt;

&lt;p&gt;The other day we changed priorities between the aforementioned two remaining 3.0.12 members, upgraded one to 3.2.9, and then set it back as the primary node, leaving us with one secondary member still on 3.0.12.&lt;/p&gt;

&lt;p&gt;Once that newly upgraded 3.2.9 host (172.16.144.3) became primary again (after a DB restart, so that the new binaries would apply), &lt;br/&gt;
the performance and response times of the applications using this replica set deteriorated rapidly.&lt;/p&gt;

&lt;p&gt;The node&apos;s log file was showing &quot;cursorExhausted:1&quot; for almost every query logged,&lt;br/&gt;
which wasn&apos;t occurring on any of the other members.&lt;/p&gt;

&lt;p&gt;When we examined the server status for cursor metrics, the number of timed-out cursors was rising gradually, while the number of pinned cursors was around 100 and the number of no-timeout cursors was around 70.&lt;/p&gt;

&lt;p&gt;One thing that was changed, apart from upgrading this primary node to 3.2.9, was to explicitly set its maximum configured WT cache size to 36GB (based on the new 3.2 WiredTiger rule of thumb of 60% of free physical RAM minus 1GB).&lt;/p&gt;

&lt;p&gt;We suspected that the root cause of the cursor exhaustion and performance drop was that the recently upgraded 3.2.9 node had just been restarted, so its cache was empty (completely &quot;cold&quot;) when massive application traffic started performing reads and writes. Shortly afterwards we fell back to the remaining 3.0.12 secondary (172.16.23.3) as primary, which resulted in an immediate, significant performance improvement and a stop to the cursor exhaustion.&lt;/p&gt;

&lt;p&gt;Please note that on the 3.0.12 node we had no explicit configuration of cacheSizeGB, but left it to the default behaviour of WiredTiger 3.2.9.&lt;/p&gt;

&lt;p&gt;After the primary was set to the last remaining 3.0.12 node, we decided to leave the 3.2.9 node that had failed to take the load as primary (172.16.144.3) to &quot;pre-heat&quot; its cache on query traffic (about 600 read statements per second), and gave it another go as primary the day after (without restarting it, as it was already on version 3.2.9). Yet the same behaviour of cursor exhaustion and massive app performance drop occurred again, forcing yet another fallback to the 3.0.12 node as primary, which again restored things to what they were before the change.&lt;/p&gt;

&lt;p&gt;Enclosed please find log files from both the 3.2.9 and 3.0.12 nodes. Notice how the same queries generate different outcomes in terms of cursor exhaustion even though they share the same execution plans. One extremely popular query is the one issuing find on lists.items-Posts in its different permutations:&lt;/p&gt;

&lt;p&gt;329_member.log.tar.gz: for host 172.16.144.3, I&apos;ve omitted the biggest log (1.5G - which covered Sep 29 09:30 to 12:43) as other logs also contain the symptoms reported here.&lt;/p&gt;

&lt;p&gt;3012_member.log.tar.gz: for host 172.16.23.3&lt;/p&gt;

&lt;p&gt;329_member_server_status.out&lt;br/&gt;
3012_member_server_status.out&lt;/p&gt;

&lt;p&gt;We have other applications based on other replica sets which are fully upgraded to 3.2.9 (including the primary, of course) and which don&apos;t display this behaviour, so this could very well have to do with the way the application driver is set up, or simply with how the application is written (or a mix of both...).&lt;/p&gt;

&lt;p&gt;Kindly assist in analysing why this behaviour occurs and recommend methods to overcome it.&lt;br/&gt;
This is quite urgent for us as we wish to have it completed by the end of next week.&lt;/p&gt;

&lt;p&gt;Many thanks in advance, &lt;br/&gt;
Avi K, DBA&lt;br/&gt;
WiX.COM&lt;/p&gt;</description>
                <environment>We are running on Debian 7&lt;br/&gt;
</environment>
        <key id="319829">SERVER-26386</key>
            <summary>Application Response Times Sky Rockets After Switching To A Freshly Upgraded 3.2.9 Primary Node - Cursor Exhaustion Spotted</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.mongodb.org/images/icons/priorities/critical.svg">Critical - P2</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="kelsey.schubert@mongodb.com">Kelsey Schubert</assignee>
                                    <reporter username="avrahamk">Avraham Kalvo</reporter>
                        <labels>
                    </labels>
                <created>Thu, 29 Sep 2016 13:07:35 +0000</created>
                <updated>Sun, 20 Nov 2016 12:31:13 +0000</updated>
                            <resolved>Sun, 20 Nov 2016 12:31:13 +0000</resolved>
                                    <version>3.2.9</version>
                                                    <component>Concurrency</component>
                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="1438420" author="avrahamk" created="Sun, 20 Nov 2016 11:49:28 +0000"  >&lt;p&gt;You may close this service request now, &lt;br/&gt;
I&apos;ve opened a new ticket regarding performance:&lt;br/&gt;
&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-27132&quot; title=&quot;Mongo response time increased in 15-20% after upgrade to 3.2.10&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-27132&quot;&gt;&lt;del&gt;SERVER-27132&lt;/del&gt;&lt;/a&gt;&lt;br/&gt;
Mongo response time increased in 15-20% after upgrade to 3.2.10&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Avi K  &lt;/p&gt;</comment>
                            <comment id="1434456" author="avrahamk" created="Tue, 15 Nov 2016 15:39:41 +0000"  >&lt;p&gt;Many thanks,&lt;br/&gt;
Will do and update here - so you can close this one.&lt;/p&gt;</comment>
                            <comment id="1434356" author="thomas.schubert" created="Tue, 15 Nov 2016 14:31:26 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=avrahamk&quot; class=&quot;user-hover&quot; rel=&quot;avrahamk&quot;&gt;avrahamk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Would you please open a new ticket with the &lt;tt&gt;diagnostic.data&lt;/tt&gt; and more details about what you are observing?&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1434208" author="avrahamk" created="Tue, 15 Nov 2016 10:02:02 +0000"  >&lt;p&gt;Thanks Thomas,&lt;/p&gt;

&lt;p&gt;Actually we have, &lt;br/&gt;
I&apos;m currently investigating a specific case which presents performance degradation after upgrading to 3.2.10.&lt;/p&gt;

&lt;p&gt;May I share this here, or should I log a new ticket?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Avi K.&lt;/p&gt;</comment>
                            <comment id="1433893" author="thomas.schubert" created="Mon, 14 Nov 2016 22:49:44 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=avrahamk&quot; class=&quot;user-hover&quot; rel=&quot;avrahamk&quot;&gt;avrahamk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;I wanted to check in regarding your recent performance. Have you encountered any performance issues since your upgrade to MongoDB 3.2.10?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1400643" author="avrahamk" created="Wed, 5 Oct 2016 12:15:13 +0000"  >&lt;p&gt;diagnostic data directory for newly upgraded primary instance&lt;/p&gt;</comment>
                            <comment id="1400607" author="avrahamk" created="Wed, 5 Oct 2016 11:36:26 +0000"  >&lt;p&gt;log file for newly upgraded instance&lt;/p&gt;</comment>
                            <comment id="1400606" author="avrahamk" created="Wed, 5 Oct 2016 11:36:04 +0000"  >&lt;p&gt;Thanks Thomas, &lt;/p&gt;

&lt;p&gt;We&apos;ve upgraded the replica set to 3.2.10, again leaving just one member on 3.0.12 (for fallback), &lt;br/&gt;
and switched the primary to the newly upgraded node.&lt;/p&gt;

&lt;p&gt;The previous symptoms in terms of performance and user experience drops have not recurred so far, &lt;br/&gt;
yet signs of widespread cursor exhaustion are still visible when examining the instance log file (enclosed).&lt;/p&gt;

&lt;p&gt;As such cursor exhaustion was not apparent for this instance before it was upgraded from 3.0.12 to 3.2.9 and then 3.2.10, we are concerned that this might gradually build up to a state of denial of service, as we experienced more rapidly last week, which eventually led to the opening of this support ticket.&lt;/p&gt;

&lt;p&gt;Kindly review the new log file enclosed here, as well as the new diagnostic directory for this instance.&lt;/p&gt;

&lt;p&gt;Bear in mind this member was again restarted upon upgrade.&lt;/p&gt;

&lt;p&gt;Many thanks,&lt;br/&gt;
Avi Kalvo&lt;br/&gt;
WiX.COM&lt;/p&gt;</comment>
                            <comment id="1398366" author="thomas.schubert" created="Sat, 1 Oct 2016 15:54:30 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=avrahamk&quot; class=&quot;user-hover&quot; rel=&quot;avrahamk&quot;&gt;avrahamk&lt;/a&gt;,&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;1. Would you recommend removing the WT cache limit configuration for a host where a single instance is running? Might this help to eliminate cursor exhaustion?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;No, we often see that increasing the WT cache limit has a negative performance impact. There is less space for the filesystem cache, and if the WT cache fills, the same eviction issues are likely to manifest as before.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;2. What specific patterns in the workload have caused the unoptimised WT handling? Are there things in the app workload that we can change/fix/improve in order to improve the situation to begin with? Can you pinpoint specific statements / configurations that we can try to avoid/bypass?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;You can read about the improvements included in MongoDB 3.2.10 in this &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-26055?focusedCommentId=1394968&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1394968&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;comment&lt;/a&gt;. If this issue is still present after upgrading to MongoDB 3.2.10, we&apos;ll continue to investigate the root cause.&lt;/p&gt;

&lt;p&gt;Please note that a general question about possible optimizations may be best suited for the &lt;a href=&quot;http://groups.google.com/group/mongodb-user&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;mongodb-users group&lt;/a&gt;, which provides a space for MongoDB-related support.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;3. When is 3.2.10 supposed to be released?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;It&apos;s available now: see our &lt;a href=&quot;https://www.mongodb.com/download-center&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;downloads page&lt;/a&gt;. Please upgrade to MongoDB 3.2.10, and let us know if it has resolved the issue.&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1398340" author="avrahamk" created="Sat, 1 Oct 2016 07:59:07 +0000"  >&lt;p&gt;Thanks for your prompt response, Thomas.&lt;br/&gt;
I have three questions in light of your latest feedback; could you please address each:&lt;/p&gt;

&lt;p&gt;1. Would you recommend removing the WT cache limit configuration for a host where a single instance is running? Might this help to eliminate cursor exhaustion?&lt;/p&gt;

&lt;p&gt;2. What specific patterns in the workload have caused the unoptimised WT handling? Are there things in the app workload that we can change/fix/improve in order to improve the situation to begin with? Can you pinpoint specific statements / configurations that we can try to avoid/bypass?&lt;/p&gt;

&lt;p&gt;3. When is 3.2.10 supposed to be released?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Avi K&lt;br/&gt;
WiX &lt;/p&gt;</comment>
                            <comment id="1397342" author="thomas.schubert" created="Fri, 30 Sep 2016 05:03:51 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=avrahamk&quot; class=&quot;user-hover&quot; rel=&quot;avrahamk&quot;&gt;avrahamk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;We have analyzed the log files you uploaded, and it appears as though the cause is that WiredTiger cache management is not able to keep up with the workload. We have been working hard on improving the efficiency of WiredTiger cache management since the 3.2.9 release. The particular characteristics of the data you uploaded don&apos;t match workloads we have been optimizing, but there is still a good chance that the upcoming 3.2.10 release will fix the performance degradation you are seeing. Would you please try upgrading again once the new release is available?&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1396602" author="avrahamk" created="Thu, 29 Sep 2016 14:37:02 +0000"  >&lt;p&gt;Enclosed please find, as per your request.&lt;/p&gt;</comment>
                            <comment id="1396589" author="thomas.schubert" created="Thu, 29 Sep 2016 14:24:03 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=avrahamk&quot; class=&quot;user-hover&quot; rel=&quot;avrahamk&quot;&gt;avrahamk&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thanks for opening this ticket. We are investigating the issue. To help, would you please provide an archive of the &lt;tt&gt;diagnostic.data&lt;/tt&gt; directory for the 3.2.9 node?&lt;/p&gt;

&lt;p&gt;Kind regards,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="333123">SERVER-27132</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="139981" name="3012_member.log.tar.gz" size="19305733" author="avrahamk" created="Thu, 29 Sep 2016 13:07:35 +0000"/>
                            <attachment id="139979" name="3012_member_server_status.out" size="14628" author="avrahamk" created="Thu, 29 Sep 2016 13:07:35 +0000"/>
                            <attachment id="140584" name="3210_primary_cursor_exahustion_log.tar.gz" size="10842681" author="avrahamk" created="Wed, 5 Oct 2016 11:36:26 +0000"/>
                            <attachment id="139982" name="329_member.log.tar.gz" size="134438279" author="avrahamk" created="Thu, 29 Sep 2016 13:07:35 +0000"/>
                            <attachment id="139996" name="329_member_diagnostic_data.tar.gz" size="41311916" author="avrahamk" created="Thu, 29 Sep 2016 14:37:02 +0000"/>
                            <attachment id="139980" name="329_member_server_status.out" size="19561" author="avrahamk" created="Thu, 29 Sep 2016 13:07:35 +0000"/>
                            <attachment id="140585" name="diagnostic.data.tar.gz" size="96846282" author="avrahamk" created="Wed, 5 Oct 2016 12:15:13 +0000"/>
                            <attachment id="140701" name="s26386.png" size="52800" author="alexander.gorrod@mongodb.com" created="Thu, 6 Oct 2016 01:09:54 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>13.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 29 Sep 2016 14:24:03 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        7 years, 12 weeks, 3 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            7 years, 12 weeks, 3 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>avrahamk</customfieldvalue>
            <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrjuo7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsqaq7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;p&gt;Upgrade primary to 3.2.9 &lt;br/&gt;
Configure WT cache to a predefined value&lt;/p&gt;</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[kelsey.schubert@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsehv3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>