<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:01:01 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
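
A minimal sketch of the 'field' parameter described above, in Python. The export URL is a hypothetical placeholder and with_fields is an illustrative helper, not part of JIRA; substitute the URL this document was actually requested from.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fields(url, fields):
    """Return url with one 'field' query parameter appended per requested field."""
    parts = urlsplit(url)
    # Keep any query parameters that are already present on the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(("field", name) for name in fields)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical export URL, used only for illustration.
print(with_fields("https://jira.example.invalid/issue.xml", ["key", "summary"]))
```

Repeating the parameter once per field, rather than joining the names into one value, matches the form shown in the example above.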
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-22642] WiredTiger engine resync stalls with a lot of tables/indexes</title>
                <link>https://jira.mongodb.org/browse/SERVER-22642</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We are trying to upgrade a replica set with many collections and indexes to WiredTiger. The replica set has 30k collections and 14 indexes per collection.&lt;/p&gt;

&lt;p&gt;The initial data sync and index build work fine, but the server makes very little progress once it starts to apply the replication log. It seems to make no progress for ~15 minutes; during this time it consumes a full core of CPU and does very little I/O. It then does a burst of I/O for a minute or so before falling back to consuming a lot of CPU.&lt;/p&gt;

&lt;p&gt;Based on the attached perf profile it looks like most of the CPU is being consumed by the eviction thread. (Possibly during a checkpoint?)&lt;/p&gt;

&lt;p&gt;Let me know if there is any additional information I can provide to help track this down.&lt;/p&gt;</description>
                <environment></environment>
        <key id="265831">SERVER-22642</key>
            <summary>WiredTiger engine resync stalls with a lot of tables/indexes</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="kelsey.schubert@mongodb.com">Kelsey Schubert</assignee>
                                    <reporter username="bpot">Bob Potter</reporter>
                        <labels>
                    </labels>
                <created>Tue, 16 Feb 2016 19:43:52 +0000</created>
                <updated>Mon, 18 Apr 2016 20:18:02 +0000</updated>
                            <resolved>Mon, 18 Apr 2016 20:18:02 +0000</resolved>
                                    <version>3.0.9</version>
                    <version>3.2.1</version>
                                                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>11</watches>
                                                                                                                <comments>
                            <comment id="1239273" author="thomas.schubert" created="Mon, 18 Apr 2016 20:18:02 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for uploading the diagnostic.data. I have confirmed that the original issue appears to have been resolved by &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-22209&quot; title=&quot;Collection creation during final phase of checkpoint holds database lock for extended time&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-22209&quot;&gt;&lt;del&gt;SERVER-22209&lt;/del&gt;&lt;/a&gt; / &lt;a href=&quot;https://jira.mongodb.org/browse/WT-2346&quot; title=&quot;Don&amp;#39;t hold schema lock during checkpoint I/O&quot; class=&quot;issue-link&quot; data-issue-key=&quot;WT-2346&quot;&gt;&lt;del&gt;WT-2346&lt;/del&gt;&lt;/a&gt;, so I will be closing this ticket as a duplicate.&lt;/p&gt;

&lt;p&gt;The new behavior that you are observing may be explained by &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-22906&quot; title=&quot;MongoD uses excessive memory over and above the WiredTiger cache size&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-22906&quot;&gt;&lt;del&gt;SERVER-22906&lt;/del&gt;&lt;/a&gt;. However, if this is still an issue for you, please open a new ticket and we will continue to investigate.&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1236096" author="bpot" created="Thu, 14 Apr 2016 17:33:36 +0000"  >&lt;p&gt;Hi Thomas,&lt;/p&gt;

&lt;p&gt;By dropping the cache size from the default of 17GB to 8GB I&apos;ve been able to avoid the out-of-memory errors (for now at least! &amp;#8211; it&apos;s still slowly growing). But it looks like replication is still having trouble catching up with the primary. I&apos;ve uploaded the diagnostic data and also a CPU profile I created using `perf`. It still seems to be using a lot of CPU during checkpointing, but the profile looks different.&lt;/p&gt;

&lt;p&gt;Thanks for looking at this.&lt;/p&gt;</comment>
                            <comment id="1232733" author="thomas.schubert" created="Tue, 12 Apr 2016 01:45:42 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;I&apos;m sorry you are still encountering issues. Please upload the diagnostic.data so we can investigate this new behavior. After we have a chance to examine the diagnostic.data, we can determine whether a new ticket would be appropriate.&lt;/p&gt;

&lt;p&gt;Thank you for your help,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1232680" author="bpot" created="Mon, 11 Apr 2016 23:52:54 +0000"  >&lt;p&gt;Hi Thomas,&lt;/p&gt;

&lt;p&gt;I&apos;ve tried a couple of syncs with a recent nightly build (3.2.4-105-g73290d0). I think I&apos;m seeing a different problem now. Once the initial sync+index build process finishes, the mongod process starts using a lot of memory and either brings down the instance or is killed by OOM. Would it be helpful if I uploaded a copy of diagnostic.data from those failed syncs? Should I open a new issue for this?&lt;/p&gt;

&lt;p&gt;It&apos;s unclear if this issue is fixed or not.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Bob&lt;/p&gt;</comment>
                            <comment id="1225787" author="thomas.schubert" created="Tue, 5 Apr 2016 06:14:41 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for uploading the log. From our analysis, it appears that WiredTiger checkpoints are completing very slowly once the server starts to apply the replication log. In MongoDB 3.2.4, WiredTiger holds a lock for much of the time it is doing a checkpoint, and any collection create or drop also needs the same lock. As a result, long-running checkpoints stall any workload that creates or drops collections or indexes. A recent improvement has been made to WiredTiger so that a checkpoint doesn&apos;t take the lock needed to create/drop tables (&lt;a href=&quot;https://jira.mongodb.org/browse/WT-2346&quot; title=&quot;Don&amp;#39;t hold schema lock during checkpoint I/O&quot; class=&quot;issue-link&quot; data-issue-key=&quot;WT-2346&quot;&gt;&lt;del&gt;WT-2346&lt;/del&gt;&lt;/a&gt;). This improvement will be included in MongoDB 3.2.5.&lt;/p&gt;

&lt;p&gt;Please upgrade to MongoDB 3.2.5 when it is released and report back if the issue persists.&lt;/p&gt;

&lt;p&gt;Kind regards,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1195686" author="bpot" created="Mon, 7 Mar 2016 21:52:20 +0000"  >&lt;p&gt;I&apos;ve uploaded the logs to the private upload portal. I ran the fruitsalad script on it and removed all lines which contained &apos;failed to apply update&apos; since they still included some of the update commands unredacted.&lt;/p&gt;</comment>
                            <comment id="1192235" author="thomas.schubert" created="Thu, 3 Mar 2016 19:15:40 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;We will need the complete log covering the sync attempt. As Ramon mentioned, these files will be only visible to MongoDB employees investigating this issue and are routinely deleted after some time.&lt;/p&gt;

&lt;p&gt;If you are unable to provide these logs because of sensitive customer data, we will wait for the redacted logs. You may find this &lt;a href=&quot;https://github.com/rueckstiess/fruitsalad&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;script&lt;/a&gt; useful to redact customer data.&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1187624" author="ramon.fernandez" created="Mon, 29 Feb 2016 13:52:29 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;, I&apos;ve created a &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/d3b85993-5084-4814-a6c3-33c3e2cc5fcc.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;private upload portal&lt;/a&gt; so any data you upload will only be visible to MongoDB engineers. Will discuss with Thomas later and see what pieces we&apos;ll need first.&lt;/p&gt;</comment>
                            <comment id="1187443" author="bpot" created="Mon, 29 Feb 2016 09:55:03 +0000"  >&lt;p&gt;Hi Thomas,&lt;/p&gt;

&lt;p&gt;I&apos;m looking at the logs and they contain private customer information. Most notably from &apos;failed to apply update&apos; log records. So I can&apos;t share them in total. I could try to strip log entries but it is 200MB worth of logs.&lt;/p&gt;

&lt;p&gt;Is there log information from a particular subsystem that would be most useful?&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Bob &lt;/p&gt;</comment>
                            <comment id="1181009" author="thomas.schubert" created="Mon, 22 Feb 2016 20:08:25 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;We are examining the diagnostic data you have uploaded. So that we can get a better idea of what is going on here, can you please upload the logs covering the duration of the sync attempt as well?&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1180857" author="bpot" created="Mon, 22 Feb 2016 18:35:10 +0000"  >&lt;p&gt;Ramon,&lt;/p&gt;

&lt;p&gt;I&apos;ve added the diagnostic data for the most recent sync attempt on 3.2.1.&lt;/p&gt;

&lt;p&gt;Yeah, we&apos;re aware that our model may not be a great fit for WiredTiger.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Bob&lt;/p&gt;</comment>
                            <comment id="1175027" author="ramon.fernandez" created="Wed, 17 Feb 2016 00:19:43 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=bpot&quot; class=&quot;user-hover&quot; rel=&quot;bpot&quot;&gt;bpot&lt;/a&gt;, if the affected secondary is running 3.2.1, can you please upload the contents of the &lt;tt&gt;diagnostic.data&lt;/tt&gt; directory in your &lt;tt&gt;dbpath&lt;/tt&gt; to this ticket? That should help us investigate this issue.&lt;/p&gt;

&lt;p&gt;Please note that your data distribution will require over 400k files when using WiredTiger, so this may cause other issues as well.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="258996">SERVER-22209</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="190912">SERVER-17675</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="110111" name="diagnostic.data.tar.gz" size="22507101" author="bpot" created="Mon, 22 Feb 2016 18:33:54 +0000"/>
                            <attachment id="117798" name="diagnostic.data_3.2.4-105-g73290d0.tar.bz2" size="24524205" author="bpot" created="Thu, 14 Apr 2016 17:27:54 +0000"/>
                            <attachment id="109253" name="mongo_32_perf.txt" size="575880" author="bpot" created="Tue, 16 Feb 2016 19:43:52 +0000"/>
                            <attachment id="117797" name="perf_report_3.2.4-105-g73290d0.txt.gz" size="703947" author="bpot" created="Thu, 14 Apr 2016 17:27:54 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 16 Feb 2016 21:37:33 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        7 years, 43 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            7 years, 43 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>bpot</customfieldvalue>
            <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrkgp3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hsigmv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[kelsey.schubert@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hsf8hb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>