<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:54:29 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-62213] Investigate presence of multiple migration coordinator documents</title>
                <link>https://jira.mongodb.org/browse/SERVER-62213</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;On one cluster, 4 migration coordinator documents were found on a single shard, which caused &lt;a href=&quot;https://github.com/mongodb/mongo/blob/70af526008b1e008e52369dca0aae64d48005bef/src/mongo/db/s/migration_util.cpp#L892-L895&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this invariant&lt;/a&gt; to be hit on step-up.&lt;/p&gt;

&lt;p&gt;The documents all belonged to migrations for different namespaces, and their states were:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;2 aborted&lt;/li&gt;
	&lt;li&gt;1 committed&lt;/li&gt;
	&lt;li&gt;1 without decision&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The range deletions appear to have been handled correctly on both the donor and the recipients:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;No range deletion documents for the aborted migrations (range deletion tasks already executed)&lt;/li&gt;
	&lt;li&gt;Ready range deletion task on the donor for the committed migration&lt;/li&gt;
	&lt;li&gt;Pending range deletions on donor/recipient for the migration without decision&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Given the state of the &quot;decided&quot; migrations, we can conclude that:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/19080405d9c7d08b54e70228a92d237223f2885c/src/mongo/db/s/migration_coordinator.cpp#L248&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;_abortMigrationOnDonorAndRecipient&lt;/a&gt; worked well.&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/19080405d9c7d08b54e70228a92d237223f2885c/src/mongo/db/s/migration_coordinator.cpp#L188&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;_commitMigrationOnDonorAndRecipient&lt;/a&gt; worked well.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;It is therefore very likely that something went wrong right afterwards, during the call to &lt;a href=&quot;https://github.com/mongodb/mongo/blob/19080405d9c7d08b54e70228a92d237223f2885c/src/mongo/db/s/migration_coordinator.cpp#L183&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;forgetMigration&lt;/a&gt;, which did not remove the migration coordinator documents.&lt;/p&gt;</description>
                <environment></environment>
        <key id="1955134">SERVER-62213</key>
            <summary>Investigate presence of multiple migration coordinator documents</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="pierlauro.sciarelli@mongodb.com">Pierlauro Sciarelli</assignee>
                                    <reporter username="pierlauro.sciarelli@mongodb.com">Pierlauro Sciarelli</reporter>
                        <labels>
                    </labels>
                <created>Tue, 21 Dec 2021 18:49:25 +0000</created>
                <updated>Mon, 21 Mar 2022 16:38:02 +0000</updated>
                            <resolved>Thu, 23 Dec 2021 11:15:29 +0000</resolved>
                                    <version>5.0.5</version>
                    <version>5.1.1</version>
                                                                        <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="4265440" author="pierlauro.sciarelli" created="Thu, 23 Dec 2021 08:36:53 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=jordi.serra-torrens&quot; class=&quot;user-hover&quot; rel=&quot;jordi.serra-torrens&quot;&gt;jordi.serra-torrens&lt;/a&gt; correctly pointed out that the failure may come from waiting for the vector clock&apos;s config time to be majority committed as part of deleteMigrationCoordinatorDocumentLocally. This explains why the delete of the migration coordinator was not performed: it was never reached.&lt;/p&gt;</comment>
                            <comment id="4265426" author="pierlauro.sciarelli" created="Thu, 23 Dec 2021 08:09:42 +0000"  >&lt;p&gt;The leaked coordinators were not deleted after delivering a decision due to a &lt;tt&gt;WriteConcernFailed (&quot;waiting for replication timed out&quot;)&lt;/tt&gt; exception. My interpretation is that, since the node was under a lot of pressure, the deletion probably could not be committed locally right away, and that caused a failure of &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b169a72cd2f2d36a2059af2f484cfcca248a7e1c/src/mongo/db/s/migration_util.cpp#L833&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this very restrictive write concern&lt;/a&gt; (locally commit with a timeout of 0 seconds). This is odd because, assuming the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b169a72cd2f2d36a2059af2f484cfcca248a7e1c/src/mongo/db/persistent_task_store.h#L106&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;persistent task store&apos;s remove&lt;/a&gt; honours the &lt;a href=&quot;https://docs.mongodb.com/v5.0/reference/write-concern/#wtimeout&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;wtimeout documentation&lt;/a&gt;, a timeout of 0 seconds should mean no timeout at all.&lt;/p&gt;

&lt;p&gt;While investigating this problem further, there also seems to be &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b169a72cd2f2d36a2059af2f484cfcca248a7e1c/src/mongo/db/s/migration_source_manager.cpp#L747-L758&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;overly optimistic logic&lt;/a&gt; associated with the failure of a migration:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Log a warning&lt;/li&gt;
	&lt;li&gt;Clear the filtering metadata for the migration&apos;s namespace&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;It basically assumes that:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The exception was due to the primary stepping down&lt;/li&gt;
	&lt;li&gt;The migration will be resumed on the next step-up&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Since other exceptions have been observed to break those assumptions, it would be reasonable to extend the catch body to check the kind of error and react accordingly (e.g. if the exception is not due to a stepdown, handle the scenario differently).&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10520">
                    <name>Problem/Incident</name>
                                            <outwardlinks description="causes">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1955958">SERVER-62245</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1955856">SERVER-62243</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[5002K000012RaYCQA0]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        2 years, 6 weeks, 6 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            2 years, 6 weeks, 6 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>pierlauro.sciarelli@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0fd13:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hzykgv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="5430">Sharding EMEA 2021-12-27</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0ez6f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>