<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:48:05 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-38147] Cap donor migration lock acquisition stalls in the presence of active transactions</title>
                <link>https://jira.mongodb.org/browse/SERVER-38147</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Today&#8217;s chunk migration behaves like any other collection DDL operation and uses an exclusive collection lock for synchronization. Because the lock manager grants collection locks fairly, in the presence of multi-statement transactions a pending X-lock acquisition also blocks the requests queued behind it, so it can stall access to the entire collection for up to the transaction timeout (which defaults to 1 minute). Worse, unlike DDL, chunk migration is not user-initiated, and customers have no control over it apart from disabling the balancer.&lt;/p&gt;

&lt;p&gt;Since this is not acceptable and may lead to outages, we will cap every migration-related lock acquisition stall at a configurable parameter value that defaults to 500 milliseconds.&lt;/p&gt;

&lt;p&gt;On the donor side, three state transitions use the collection X lock to synchronize with the concurrent workload, and each of them can lead to the described stalls:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Starting the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/fd7dfafb0ec103ac9a1c02489e8a99b32c509bdd/src/mongo/db/s/migration_source_manager.cpp#L249&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;clone&lt;/a&gt;&#160;phase&lt;/li&gt;
	&lt;li&gt;Entering the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/fd7dfafb0ec103ac9a1c02489e8a99b32c509bdd/src/mongo/db/s/collection_sharding_runtime.cpp#L185&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;catch-up (read-only)&lt;/a&gt; phase of the critical section&lt;/li&gt;
	&lt;li&gt;Entering the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/fd7dfafb0ec103ac9a1c02489e8a99b32c509bdd/src/mongo/db/s/collection_sharding_runtime.cpp#L196&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;commit&lt;/a&gt; phase of the critical section&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;As part of this ticket, we should implement the following:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;A startup- and runtime-configurable parameter called &lt;tt&gt;migrationLockAcquisitionMaxWaitMS&lt;/tt&gt;, defaulting to 500 ms&lt;/li&gt;
	&lt;li&gt;Change all the usages of AutoGetCollection in the three locations above to pass a &lt;a href=&quot;https://github.com/mongodb/mongo/blob/fd7dfafb0ec103ac9a1c02489e8a99b32c509bdd/src/mongo/db/catalog_raii.h#L94&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;deadline&lt;/a&gt; of &lt;tt&gt;now() + migrationLockAcquisitionMaxWaitMS&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;Run the tests multiple times to make sure that they don&apos;t fail on slower machines and, if need be, raise the &lt;tt&gt;migrationLockAcquisitionMaxWaitMS&lt;/tt&gt; default to 30 seconds when test commands are enabled.&lt;/li&gt;
&lt;/ul&gt;
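
&lt;p&gt;The deadline-bounded acquisition described above can be sketched roughly as follows. This is a Python stand-in for illustration only, not the server&apos;s C++ code: the names MIGRATION_LOCK_ACQUISITION_MAX_WAIT_MS, LockTimeout, acquire_with_deadline and enter_migration_phase are hypothetical, and a plain threading.Lock stands in for the collection X lock.&lt;/p&gt;

```python
import threading
import time

# Hypothetical stand-in for the proposed server parameter.
# 500 ms is the default named in this ticket.
MIGRATION_LOCK_ACQUISITION_MAX_WAIT_MS = 500


class LockTimeout(Exception):
    """Raised when the lock is not acquired before the deadline."""


def acquire_with_deadline(lock, deadline):
    """Try to take `lock`, giving up once the monotonic clock reaches `deadline`."""
    remaining = max(0.0, deadline - time.monotonic())
    if not lock.acquire(timeout=remaining):
        raise LockTimeout("collection lock not acquired before the deadline")


def enter_migration_phase(lock, max_wait_ms=MIGRATION_LOCK_ACQUISITION_MAX_WAIT_MS):
    """Enter a migration phase, failing fast instead of stalling other work."""
    # deadline = now() + max wait, as proposed in the ticket
    deadline = time.monotonic() + max_wait_ms / 1000.0
    acquire_with_deadline(lock, deadline)
    try:
        # The state transition (clone, catch-up or commit) would happen here.
        return "entered"
    finally:
        lock.release()
```

&lt;p&gt;In this sketch a caller that hits LockTimeout would be expected to fail or retry the migration step rather than keep waiting, so the stall seen by other operations stays bounded by the configured wait.&lt;/p&gt;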
</description>
                <environment></environment>
        <key id="634354">SERVER-38147</key>
            <summary>Cap donor migration lock acquisition stalls in the presence of active transactions</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="kimberly.tao@mongodb.com">Kim Tao</assignee>
                                    <reporter username="kaloian.manassiev@mongodb.com">Kaloian Manassiev</reporter>
                        <labels>
                            <label>newgrad</label>
                    </labels>
                <created>Thu, 15 Nov 2018 12:11:54 +0000</created>
                <updated>Sun, 29 Oct 2023 22:26:32 +0000</updated>
                            <resolved>Thu, 24 Jan 2019 21:47:09 +0000</resolved>
                                                    <fixVersion>4.1.8</fixVersion>
                                    <component>Sharding</component>
                                        <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="3851518" author="esha.maharishi@10gen.com" created="Tue, 1 Jun 2021 16:00:49 +0000"  >&lt;p&gt;I see, it was because even read-only transactions used MODE_IX (&quot;write&quot;) locks (I&apos;m not sure if read-only transactions still use MODE_IX locks). Thank you!&lt;/p&gt;</comment>
                            <comment id="3849958" author="kaloian.manassiev" created="Mon, 31 May 2021 16:34:20 +0000"  >&lt;p&gt;At the time when I wrote this response, transactions were using MODE_IX for locks, even if they were read-only transactions. We use the mode of the global lock &lt;a href=&quot;https://github.com/mongodb/mongo/blob/5aeade8f1ed4a895f935dc7c5dbec506f837d63d/src/mongo/db/s/collection_sharding_runtime.cpp#L340&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;to decide&lt;/a&gt; whether to apply the critical section to an operation. This means that a transaction, even if we are in the read-only/catch-up phase, will still block behind the critical section (or rather, the transaction will be aborted).&lt;/p&gt;</comment>
                            <comment id="3837402" author="esha.maharishi@10gen.com" created="Tue, 25 May 2021 14:08:44 +0000"  >&lt;blockquote&gt;&lt;p&gt;There cannot be any active transactions on this collection once we enter the read-only part of the critical section.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=kaloian.manassiev&quot; class=&quot;user-hover&quot; rel=&quot;kaloian.manassiev&quot;&gt;kaloian.manassiev&lt;/a&gt;, did you mean there cannot be active transactions once we enter the&#160;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/6d2e673bd7c69aa0de24ba3ce1ac3aa1c71343be/src/mongo/db/s/collection_sharding_runtime.cpp#L455&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;commit phase&lt;/a&gt; of the critical section? During the read-only phase&#160;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/6d2e673bd7c69aa0de24ba3ce1ac3aa1c71343be/src/mongo/db/s/collection_sharding_runtime.cpp#L434&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;(catch-up phase)&lt;/a&gt;, I think new read-only transactions can still be started.&lt;/p&gt;

&lt;p&gt;I still agree that the timeout is not needed on the lock acquisitions in the commit phase, since there cannot be active transactions once in the commit phase.&lt;/p&gt;</comment>
                            <comment id="2126589" author="xgen-internal-githook" created="Thu, 24 Jan 2019 21:45:57 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;email&apos;: &apos;kimberly.tao@mongodb.com&apos;, &apos;name&apos;: &apos;Kim Tao&apos;, &apos;username&apos;: &apos;Kimchelly&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-38147&quot; title=&quot;Cap donor migration lock acquisition stalls in the presence of active transactions&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-38147&quot;&gt;&lt;del&gt;SERVER-38147&lt;/del&gt;&lt;/a&gt;: cap donor migration lock acquisition stalls in the presence of active transactions&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/ae2607974156a4141cceec7b682418d57057e89e&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/ae2607974156a4141cceec7b682418d57057e89e&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2075490" author="kaloian.manassiev" created="Thu, 29 Nov 2018 15:59:58 +0000"  >&lt;p&gt;Yes, these two are part of the &quot;cleanup&quot; and that&apos;s why I didn&apos;t account for them. There cannot be any active transactions on this collection once we enter the read-only part of the critical section.&lt;/p&gt;</comment>
                            <comment id="2075483" author="esha.maharishi@10gen.com" created="Thu, 29 Nov 2018 15:55:52 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=kaloian.manassiev&quot; class=&quot;user-hover&quot; rel=&quot;kaloian.manassiev&quot;&gt;kaloian.manassiev&lt;/a&gt;, there are also the two collection X lock acquisitions to refresh the CSS after the migration commit (if the remote&#160;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/dcf7e0dd89d34f58b592f1adb3d41e5edd6e2012/src/mongo/db/s/migration_source_manager.cpp#L472&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;refresh succeeds&lt;/a&gt;&#160;or if the remote&#160;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/dcf7e0dd89d34f58b592f1adb3d41e5edd6e2012/src/mongo/db/s/migration_source_manager.cpp#L481-L483&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;refresh fails&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;However, these acquisitions are inside the critical section - is that why they don&apos;t need the deadline?&lt;/p&gt;

&lt;p&gt;Note that the &quot;remote refresh succeeds&quot; acquisition is actually inside forceShardFilteringMetadataRefresh() (which we are not going to initially put a timeout on), and the &quot;remote refresh fails&quot; one is under an UninterruptibleLockGuard.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                            <outwardlinks description="depends on">
                                        <issuelink>
            <issuekey id="674873">SERVER-39091</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="514197">SERVER-34018</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 29 Nov 2018 15:55:52 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        2 years, 36 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[<s><a href='https://jira.mongodb.org/browse/SERVER-39091'>SERVER-39091</a></s>]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-1286</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            2 years, 36 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>esha.maharishi@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>kaloian.manassiev@mongodb.com</customfieldvalue>
            <customfieldvalue>kimberly.tao@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hucyvb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr85qf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2640">Sharding 2018-12-31</customfieldvalue>
    <customfieldvalue id="2725">Sharding 2019-01-14</customfieldvalue>
    <customfieldvalue id="2726">Sharding 2019-01-28</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hucl4n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>