<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:17:41 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-48641] Deadlock due to the MigrationDestinationManager waiting for write concern with the session checked-out</title>
                <link>https://jira.mongodb.org/browse/SERVER-48641</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The MigrationDestinationManager &lt;a href=&quot;https://github.com/mongodb/mongo/blob/d1f007a30fc2be7faf66747b8bf3fc13c7c41938/src/mongo/db/s/migration_destination_manager.cpp#L779&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;checks out a session&lt;/a&gt; and then proceeds to execute the recipient logic while that session is checked out.&lt;/p&gt;

&lt;p&gt;At some point the execution logic reaches a call to &lt;a href=&quot;https://github.com/mongodb/mongo/blob/d1f007a30fc2be7faf66747b8bf3fc13c7c41938/src/mongo/db/s/migration_destination_manager.cpp#L162&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;&lt;tt&gt;waitForWriteConcern&lt;/tt&gt;&lt;/a&gt;, which runs with the session still checked out.&lt;/p&gt;

&lt;p&gt;Because the JournalFlusher wait is non-interruptible (and also because &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-40081&quot; title=&quot;Move session checkout to before command execution&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-40081&quot;&gt;&lt;del&gt;SERVER-40081&lt;/del&gt;&lt;/a&gt; prohibits waitForWriteConcern while a session is checked out), this causes a three-thread deadlock with the replication coordinator:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;T1: MigrationDestinationManager has a session checked-out and is waiting on waitForWriteConcern, which in turn is blocked on JournalFlusher::waitForJournalFlush&lt;/li&gt;
	&lt;li&gt;T2: The JournalFlusher is waiting on a MODE_IX RSM lock, which is held in MODE_X by ReplCoord-3&lt;/li&gt;
	&lt;li&gt;T3: ReplCoord-3, while holding the RSM lock in MODE_X, is killing sessions by calling invalidateSessionsForStepdown and this is blocked on the session checked-out by T1&lt;/li&gt;
&lt;/ul&gt;
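
&lt;p&gt;As an illustration only (this is not code from the server), the three waits listed above can be modeled as a wait-for graph in which each thread points at the thread it is blocked on. The thread labels and the find_cycle helper are hypothetical names chosen for this sketch. Following the edges from T1 leads back to T1, which is the deadlock; dropping the T3 edge, as would happen if T1 gave up its session before waiting, leaves no cycle.&lt;/p&gt;

```python
# Wait-for graph for the three threads described above.
# Each entry reads "waiter is blocked on holder"; the labels are illustrative only.
T1 = "T1 MigrationDestinationManager"
T2 = "T2 JournalFlusher"
T3 = "T3 ReplCoord-3"

WAITS_FOR = {
    T1: T2,  # T1 waits in waitForWriteConcern on the journal flush
    T2: T3,  # T2 wants the RSM lock in MODE_IX, held in MODE_X by T3
    T3: T1,  # T3 is killing sessions and waits on the session T1 has checked out
}


def find_cycle(graph, start):
    """Follow blocked-on edges from start; return the cycle as a list, or None."""
    path = []
    node = start
    while node in graph and node not in path:
        path.append(node)
        node = graph[node]
    if node in path:
        # Close the loop by repeating the node where the walk re-entered the path.
        return path[path.index(node):] + [node]
    return None


cycle = find_cycle(WAITS_FOR, T1)

# Model T1 yielding its session: T3 no longer has anything to wait on.
after_yield = {waiter: holder for waiter, holder in WAITS_FOR.items() if waiter != T3}
no_cycle = find_cycle(after_yield, T1)

if __name__ == "__main__":
    print(" waits on ".join(cycle))
    print("cycle after T1 yields its session:", no_cycle)
```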
</description>
                <environment></environment>
        <key id="1374727">SERVER-48641</key>
            <summary>Deadlock due to the MigrationDestinationManager waiting for write concern with the session checked-out</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="jack.mulrow@mongodb.com">Jack Mulrow</assignee>
                                    <reporter username="kaloian.manassiev@mongodb.com">Kaloian Manassiev</reporter>
                        <labels>
                            <label>KP44</label>
                    </labels>
                <created>Mon, 8 Jun 2020 12:52:37 +0000</created>
                <updated>Sun, 29 Oct 2023 22:07:19 +0000</updated>
                            <resolved>Thu, 16 Jul 2020 14:28:10 +0000</resolved>
                                    <version>4.4.0-rc8</version>
                                    <fixVersion>4.4.1</fixVersion>
                    <fixVersion>4.7.0</fixVersion>
                                    <component>Sharding</component>
                                        <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="3335425" author="xgen-internal-githook" created="Wed, 12 Aug 2020 20:59:07 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Jack Mulrow&apos;, &apos;email&apos;: &apos;jack.mulrow@mongodb.com&apos;, &apos;username&apos;: &apos;jsmulrow&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48641&quot; title=&quot;Deadlock due to the MigrationDestinationManager waiting for write concern with the session checked-out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48641&quot;&gt;&lt;del&gt;SERVER-48641&lt;/del&gt;&lt;/a&gt; &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48689&quot; title=&quot;MigrationDestinationManager waits for thread to join with session checked out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48689&quot;&gt;&lt;del&gt;SERVER-48689&lt;/del&gt;&lt;/a&gt; Yield session in migration destination driver when waiting on replication and session migration&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 21b083c7352704fc8c3d8a4f33c54040259ff766)&lt;br/&gt;
Branch: v4.4&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/91f3ad01c5fe5599d9ba679a659745fa3b7eb00b&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/91f3ad01c5fe5599d9ba679a659745fa3b7eb00b&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3307480" author="tess.avitabile" created="Mon, 27 Jul 2020 15:18:25 +0000"  >&lt;p&gt;Great, thank you!&lt;/p&gt;</comment>
                            <comment id="3307466" author="esha.maharishi@10gen.com" created="Mon, 27 Jul 2020 15:15:35 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=kaloian.manassiev&quot; class=&quot;user-hover&quot; rel=&quot;kaloian.manassiev&quot;&gt;kaloian.manassiev&lt;/a&gt; yes, they are two different deadlocks that had the same root cause. Both deadlocks should only have existed on 4.4, since they were due to code introduced to the MigrationDestinationManager in 4.4.&lt;/p&gt;</comment>
                            <comment id="3307148" author="kaloian.manassiev" created="Mon, 27 Jul 2020 13:40:02 +0000"  >&lt;p&gt;This specific bug is entirely new for 4.4, so it has no effect on 4.2. I don&apos;t understand the difference between it and &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48689&quot; title=&quot;MigrationDestinationManager waits for thread to join with session checked out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48689&quot;&gt;&lt;del&gt;SERVER-48689&lt;/del&gt;&lt;/a&gt;, except that maybe the latter is a different manifestation of the same problem. In either case, neither of the two should be present in 4.2.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=esha.maharishi&quot; class=&quot;user-hover&quot; rel=&quot;esha.maharishi&quot;&gt;esha.maharishi&lt;/a&gt;, I think Jack is on vacation - can you confirm my understanding?&lt;/p&gt;</comment>
                            <comment id="3307127" author="tess.avitabile" created="Mon, 27 Jul 2020 13:29:58 +0000"  >&lt;p&gt;Does this affect 4.2? We need to backport&#160;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47645&quot; title=&quot;Must invalidate all sessions on step down&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47645&quot;&gt;&lt;del&gt;SERVER-47645&lt;/del&gt;&lt;/a&gt; to 4.2.&lt;/p&gt;</comment>
                            <comment id="3287910" author="xgen-internal-githook" created="Thu, 16 Jul 2020 14:07:05 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Jack Mulrow&apos;, &apos;email&apos;: &apos;jack.mulrow@mongodb.com&apos;, &apos;username&apos;: &apos;jsmulrow&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48641&quot; title=&quot;Deadlock due to the MigrationDestinationManager waiting for write concern with the session checked-out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48641&quot;&gt;&lt;del&gt;SERVER-48641&lt;/del&gt;&lt;/a&gt; &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48689&quot; title=&quot;MigrationDestinationManager waits for thread to join with session checked out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48689&quot;&gt;&lt;del&gt;SERVER-48689&lt;/del&gt;&lt;/a&gt; Yield session in migration destination driver when waiting on replication and session migration&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/21b083c7352704fc8c3d8a4f33c54040259ff766&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/21b083c7352704fc8c3d8a4f33c54040259ff766&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3276324" author="jack.mulrow" created="Wed, 8 Jul 2020 23:28:29 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=alex.taskov&quot; class=&quot;user-hover&quot; rel=&quot;alex.taskov&quot;&gt;alex.taskov&lt;/a&gt;, &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=matthew.saltz&quot; class=&quot;user-hover&quot; rel=&quot;matthew.saltz&quot;&gt;matthew.saltz&lt;/a&gt;, what do you think of the following proposed fix? (Tagging you both since you were on the resumable range deleter project. Also CC &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=esha.maharishi&quot; class=&quot;user-hover&quot; rel=&quot;esha.maharishi&quot;&gt;esha.maharishi&lt;/a&gt;&#160;for when she&apos;s back from vacation.)&lt;/p&gt;

&lt;p&gt;As far as I can tell, we check out the session for the entire recipient logic for two reasons: (1) to detect when a _recvChunkStart begins after a migration has already finished and been cleaned up (due to a split brain), and (2) to ensure that when the transaction number on the recipient is advanced as part of recovering a migration, it can only be advanced before or after all of the recipient logic, so that the recovery can safely delete the range deletion document on the recipient and trigger a range deletion (otherwise orphans from an active cloning phase might be inserted after the deletion). Am I missing any reasons?&lt;/p&gt;

&lt;p&gt;If that&apos;s true, then I think we can fix this problem (and &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48689&quot; title=&quot;MigrationDestinationManager waits for thread to join with session checked out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48689&quot;&gt;&lt;del&gt;SERVER-48689&lt;/del&gt;&lt;/a&gt;) by either:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Making the recipient yield the session every place it waits for write concern and when it joins the session migration thread (to fix &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48689&quot; title=&quot;MigrationDestinationManager waits for thread to join with session checked out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48689&quot;&gt;&lt;del&gt;SERVER-48689&lt;/del&gt;&lt;/a&gt;) and have it verify the active transaction number has not changed immediately upon checking back out the session (to guarantee no orphans can be inserted after the number is advanced by the recovery). I don&apos;t think the session migration can generate orphan documents, so it should be ok for it to run without the migration session checked out.&lt;/li&gt;
	&lt;li&gt;Only check out the session when writing to config.rangeDeletions &lt;a href=&quot;https://github.com/mongodb/mongo/blob/0b2b705b0de0e0f7f8ba28604dc5585a5dc5ba0b/src/mongo/db/s/migration_destination_manager.cpp#L878&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt; (that&apos;s the only recipient initiated write that actually uses the retryable writes machinery from what I can tell) and change migration recovery to wait for the recipient to complete some other way, e.g. sending _recvChunkStatus in a loop.&lt;/li&gt;
&lt;/ol&gt;
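
&lt;p&gt;A rough sketch of approach 1, written in Python rather than the server&apos;s C++ and using hypothetical names throughout (Session, yielded and TxnNumberChanged are stand-ins for this example, not server APIs). It assumes the session behaves like a lock with a transaction number attached: the recipient checks the session in around a blocking wait, checks it back out afterwards, and fails if the number moved in between.&lt;/p&gt;

```python
import threading
from contextlib import contextmanager


class TxnNumberChanged(Exception):
    """Raised when the transaction number advanced while the session was yielded."""


class Session:
    """Hypothetical stand-in for the migration session; not a server API."""

    def __init__(self, txn_number):
        self.txn_number = txn_number
        self._lock = threading.Lock()

    def check_out(self):
        self._lock.acquire()

    def check_in(self):
        self._lock.release()


@contextmanager
def yielded(session):
    """Check the session in for a blocking wait, then check it out and re-verify."""
    expected = session.txn_number
    session.check_in()
    try:
        yield  # the caller waits for write concern or joins a thread here
    finally:
        session.check_out()
    # Only reached when the wait itself did not raise.
    if session.txn_number != expected:
        raise TxnNumberChanged(
            "expected txn number %d, found %d" % (expected, session.txn_number)
        )


def recipient_step(session, blocking_wait):
    """Run one blocking wait without holding the session (illustrative only)."""
    with yielded(session):
        blocking_wait()
```

With this shape, anything that needs the session during the wait (for example a stepdown killing sessions) can check it out, and a recovery that advances the transaction number in that window is detected as soon as the recipient checks the session back out.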


&lt;p&gt;What do you guys think? I slightly prefer approach 1), since I expect it would be easier to implement, although it might be trickier to test.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10420">
                    <name>Backports</name>
                                            <outwardlinks description="backported by">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="1319884">SERVER-47645</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="2237902">SERVER-73106</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="1377106">SERVER-48689</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_12450" key="com.atlassian.jira.plugin.system.customfieldtypes:multicheckboxes">
                        <customfieldname>Backport Requested</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="18953"><![CDATA[v4.4]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 8 Jul 2020 23:28:29 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        3 years, 26 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            3 years, 26 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>40.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>esha.maharishi@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>jack.mulrow@mongodb.com</customfieldvalue>
            <customfieldvalue>kaloian.manassiev@mongodb.com</customfieldvalue>
            <customfieldvalue>tess.avitabile@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hxox73:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hxc1n3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="3954">Sharding 2020-07-13</customfieldvalue>
    <customfieldvalue id="4135">Sharding 2020-07-27</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hxojgf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>