<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:21:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-29603] Dropping indexes sometimes locks our entire cluster</title>
                <link>https://jira.mongodb.org/browse/SERVER-29603</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;We have run into a situation where we have the same collection on many shards, but the definition of one of our indexes in this collection varies from shard to shard.  On some shards it is sparse and on others it is not.&lt;br/&gt;
While this was our fault, this state blocks these collections from being balanced, as the balancer throws this error:&lt;br/&gt;
&quot;failed to create index before migrating data.  error: IndexOptionsConflict: Index with name: eIds.tId_1_eIds.v_-1 already exists with different options&quot;&lt;/p&gt;
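
&lt;p&gt;A rough way to find such mismatches, sketched under assumptions: the per-shard index definitions are gathered first (for example with pymongo&apos;s index_information() while connected directly to each shard&apos;s primary) into a plain dict. The helper name find_option_conflicts, the shard names, and the sample data are hypothetical, and an index that is missing entirely on a shard is not flagged.&lt;/p&gt;

```python
def find_option_conflicts(indexes_by_shard):
    """Return {index_name: {shard: definition}} for indexes that differ.

    indexes_by_shard maps a shard name to a dict shaped like the output of
    pymongo's Collection.index_information(): {index_name: {"key": [...], ...}}.
    """
    seen = {}
    for shard, indexes in indexes_by_shard.items():
        for name, spec in indexes.items():
            # Drop bookkeeping fields ("v", "ns") so only key and options are compared.
            definition = {k: v for k, v in spec.items() if k not in ("v", "ns")}
            seen.setdefault(name, {})[shard] = definition

    conflicts = {}
    for name, per_shard in seen.items():
        definitions = list(per_shard.values())
        if any(d != definitions[0] for d in definitions[1:]):
            conflicts[name] = per_shard
    return conflicts


if __name__ == "__main__":
    # Made-up sample: which shard is sparse here is purely illustrative.
    key = [("eIds.tId", 1), ("eIds.v", -1)]
    sample = {
        "shard10": {"eIds.tId_1_eIds.v_-1": {"v": 2, "key": key, "sparse": True}},
        "shard12": {"eIds.tId_1_eIds.v_-1": {"v": 2, "key": key}},
    }
    for name, per_shard in find_option_conflicts(sample).items():
        print(name, per_shard)
```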

&lt;p&gt;We are trying to remedy this by dropping the index entirely and recreating it in the background (as a side note, why on earth are indexes built in the foreground by default?).&lt;br/&gt;
This does work; however, we have to repair something like 100 collections, and twice now, within the first 5 drops, one of the dropIndex calls has completely locked up our system.&lt;/p&gt;

&lt;p&gt;I will try to reproduce this locally without the full production environment in an effort to see if this problem is tied to the indexes not matching between shards.&lt;/p&gt;</description>
                <environment>3.4.4 sharded cluster with 18 shards, each consisting of 1 replica, 1 primary, and 1 hidden replica. 3 config servers (CSRS) and 5 mongoS</environment>
        <key id="393425">SERVER-29603</key>
            <summary>Dropping indexes sometimes locks our entire cluster</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="4">Incomplete</resolution>
                                        <assignee username="kelsey.schubert@mongodb.com">Kelsey Schubert</assignee>
                                    <reporter username="glajchs">Scott Glajch</reporter>
                        <labels>
                    </labels>
                <created>Tue, 13 Jun 2017 17:07:05 +0000</created>
                <updated>Thu, 24 Aug 2017 04:23:50 +0000</updated>
                            <resolved>Sun, 16 Jul 2017 18:44:15 +0000</resolved>
                                    <version>3.2.13</version>
                    <version>3.4.4</version>
                                                    <component>WiredTiger</component>
                                        <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="1623115" author="ramon.fernandez" created="Sun, 16 Jul 2017 18:44:15 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=glajchs&quot; class=&quot;user-hover&quot; rel=&quot;glajchs&quot;&gt;glajchs&lt;/a&gt;, we haven&apos;t heard back from you for some time so I&apos;m resolving this ticket. If this is still an issue for you please provide the information requested above by Thomas so we can investigate further.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Ram&#243;n.&lt;/p&gt;</comment>
                            <comment id="1610523" author="thomas.schubert" created="Thu, 29 Jun 2017 15:47:44 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=glajchs&quot; class=&quot;user-hover&quot; rel=&quot;glajchs&quot;&gt;glajchs&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;We still need additional information to diagnose the problem. If this is still an issue for you, would you please upload the diagnostic.data to the portal I&apos;ve provided?&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1597296" author="thomas.schubert" created="Wed, 14 Jun 2017 21:08:45 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=glajchs&quot; class=&quot;user-hover&quot; rel=&quot;glajchs&quot;&gt;glajchs&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for taking the time to provide the mongod logs. Please note that diagnostic.data does not contain any customer information, and you&apos;re welcome to review the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/r3.4.4/src/mongo/db/ftdc/ftdc_mongod.cpp#L307&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;source code&lt;/a&gt; to see what is collected. Essentially, it is the output of the following commands, plus some system metrics:&lt;/p&gt;

&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;serverStatus: db.serverStatus({tcmalloc: true})&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;replSetGetStatus: rs.status()&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;collStats for local.oplog.rs: db.getSiblingDB(&apos;local&apos;).oplog.rs.stats()&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;getCmdLineOpts: db.adminCommand({getCmdLineOpts: true})&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;buildInfo: db.adminCommand({buildInfo: true})&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;hostInfo: db.adminCommand({hostInfo: true})&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;
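
&lt;p&gt;To review what these replies look like before sharing anything, the same commands can be run by hand. The following is only a sketch of that, not how FTDC itself collects data: it assumes a pymongo MongoClient connected directly to one mongod, the function name collect_ftdc_like_snapshot is hypothetical, and the system metrics are not covered.&lt;/p&gt;

```python
def collect_ftdc_like_snapshot(client):
    """Run the commands listed above once and return their raw replies.

    client is expected to behave like a pymongo MongoClient connected to a
    single mongod: client[db_name].command(...) runs a database command.
    """
    admin = client["admin"]
    local = client["local"]
    return {
        # Extra keyword arguments become additional fields of the command document.
        "serverStatus": admin.command("serverStatus", tcmalloc=True),
        "replSetGetStatus": admin.command("replSetGetStatus"),
        # Same data as db.getSiblingDB('local').oplog.rs.stats() in the shell.
        "local.oplog.rs.stats": local.command("collStats", "oplog.rs"),
        "getCmdLineOpts": admin.command("getCmdLineOpts"),
        "buildInfo": admin.command("buildInfo"),
        "hostInfo": admin.command("hostInfo"),
    }
```

&lt;p&gt;Printing the returned dict makes it possible to check which fields, if any, would be sensitive in a given deployment.&lt;/p&gt;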

&lt;p&gt;Since you&apos;ve expressed concern about privacy, I&apos;ve created a secure &lt;a href=&quot;https://10gen-httpsupload.s3.amazonaws.com/upload_forms/d69956c5-a58c-4922-a4a1-9b67531f90bd.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;upload portal&lt;/a&gt; for you to provide files going forward. Files uploaded to this portal will only be visible to MongoDB employees investigating the issue and are routinely deleted after some time. &lt;/p&gt;

&lt;p&gt;Would you please upload the diagnostic.data so we can continue to debug this behavior?&lt;/p&gt;

&lt;p&gt;Thank you,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                            <comment id="1597094" author="glajchs" created="Wed, 14 Jun 2017 19:27:52 +0000"  >&lt;p&gt;Hi Thomas, thanks for your response.  It took me a little while to get the log files.  Basically we wanted to obscure customer data by way of database names as well as actual row identifiers.&lt;/p&gt;

&lt;p&gt;We&apos;ve had 2 events in production where this has happened.  The first event happened when we were on 3.2.13 (so you might want to update the affected versions bug attribute).  The second event happened a few hours after we finished our 3.4.4 upgrade.  We were hoping that this problem would be gone after the 3.4.4 upgrade.  The attached logs are from the 1st event.  Sadly, I only realized that after anonymizing them, and I didn&apos;t want to do that work all over again.  However, the format of the logs seems to be essentially the same during both events.  Basically we had a list of collections which we knew were in the bad state (some of the shards had this index with the sparse flag and some without).  The plan was to go through them one at a time: drop the old index, wait 2 seconds, recreate the index (with background: true), wait 2 seconds, and move on to the next collection.  The 2nd collection&apos;s drop index is what hung the server.&lt;/p&gt;
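
&lt;p&gt;The repair loop described above could be sketched roughly as follows. This is an illustration, not the script that was actually run: it assumes pymongo Collection objects (drop_index, create_index and name are real pymongo members), the key pattern is inferred from the index name in the balancer error, and the function name rebuild_index and the choice of sparse value are hypothetical.&lt;/p&gt;

```python
import time

# Key pattern implied by the index name in the balancer error message.
INDEX_NAME = "eIds.tId_1_eIds.v_-1"
INDEX_KEYS = [("eIds.tId", 1), ("eIds.v", -1)]


def rebuild_index(collections, sparse=False, pause_seconds=2.0, sleep=time.sleep):
    """Drop and recreate the index on each collection, pausing after each step.

    collections is an iterable of pymongo Collection objects (or anything with
    compatible drop_index / create_index methods and a name attribute).
    Returns the names of the collections that were processed, in order.
    """
    done = []
    for coll in collections:
        coll.drop_index(INDEX_NAME)
        sleep(pause_seconds)
        # background=True mirrors the option mentioned above; sparse is whichever
        # setting all shards should agree on (an assumption in this sketch).
        coll.create_index(INDEX_KEYS, name=INDEX_NAME, background=True, sparse=sparse)
        sleep(pause_seconds)
        done.append(coll.name)
    return done
```

&lt;p&gt;Returning the processed names makes it easy to resume from the right collection if one of the drops hangs partway through the list.&lt;/p&gt;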

&lt;p&gt;In both cases, only one of our shards went to 100% CPU for the duration of the problematic dropIndex (and only during the problematic dropIndex).  The first time (log attached), it was during the 2nd collection&apos;s drop index and the problem happened on shard12.  The second time, it was during the 7th collection&apos;s drop index and the problem happened on shard10.  In both cases only the primary&apos;s CPU spiked.&lt;/p&gt;

&lt;p&gt;Unfortunately I&apos;m pretty sure we won&apos;t be able to attach the diagnostic.data file, as I think it contains customer information.  I&apos;ll keep researching the format of this file to see if there&apos;s a way I can provide it.  Also it seems that these diagnostic.data files are rotated and deleted after 1 week, so I only have the files for the 2nd event.  I have saved them off in case I can figure out a way to make it safe to share.&lt;/p&gt;

&lt;p&gt;I&apos;m also hoping to be able to reproduce with test data locally, but I&apos;m not sure I&apos;ll be able to get this to happen.&lt;/p&gt;</comment>
                            <comment id="1595788" author="thomas.schubert" created="Tue, 13 Jun 2017 17:27:33 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=glajchs&quot; class=&quot;user-hover&quot; rel=&quot;glajchs&quot;&gt;glajchs&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for the report. Would you please upload the diagnostic.data and mongod logs for an affected node, so we can investigate? &lt;/p&gt;

&lt;p&gt;I do not expect that this is related to the indexes not matching between shards, and the diagnostic.data and logs will likely allow us to determine the cause of this behavior.&lt;/p&gt;

&lt;p&gt;Kind regards,&lt;br/&gt;
Thomas&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="158717" name="shard.log_trimmed" size="809909" author="glajchs" created="Wed, 14 Jun 2017 19:28:36 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 13 Jun 2017 17:27:33 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 30 weeks, 3 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>backlog-server-pm</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 30 weeks, 3 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>
            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>
            <customfieldvalue>glajchs</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht93sf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|ht14gv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;ol&gt;
	&lt;li&gt;Drop indexes&lt;/li&gt;
	&lt;li&gt;See if reads/writes are blocked&lt;/li&gt;
&lt;/ol&gt;
</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[kelsey.schubert@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht8puv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>