<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:18:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-71198] Assert that unkillable operations that take X collection locks do not hold the RSTL</title>
                <link>https://jira.mongodb.org/browse/SERVER-71198</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The deadlocks described in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-71191&quot; title=&quot;Deadlock between index build setup, prepared transaction, and stepdown&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-71191&quot;&gt;&lt;del&gt;SERVER-71191&lt;/del&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-44722&quot; title=&quot;3 way deadlock can happen between hybrid index build, prepared transactions and stepdown thread on primary that runs index build via coordinator.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-44722&quot;&gt;&lt;del&gt;SERVER-44722&lt;/del&gt;&lt;/a&gt; are caused because the following conditions are true, in general&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;An operation is unkillable by stepdown&lt;/li&gt;
	&lt;li&gt;While holding the RSTL in IX mode, an operation takes an X collection lock. If there are any prepared transactions, this blocks.&lt;/li&gt;
	&lt;li&gt;The stepdown thread tries to acquire the RSTL in X mode, but blocks because it conflicts with the IX-mode RSTL held by the unkillable operation&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In this situation, the operation that isn&apos;t interrupted by stepdown does not make progress. We should add an assertion to our lock helpers that prevents unkillable operations from taking X locks while also holding the RSTL.&lt;/p&gt;</description>
                <environment></environment>
        <key id="2178905">SERVER-71198</key>
            <summary>Assert that unkillable operations that take X collection locks do not hold the RSTL</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="backlog-server-execution">Backlog - Storage Execution Team</assignee>
                                    <reporter username="louis.williams@mongodb.com">Louis Williams</reporter>
                        <labels>
                    </labels>
                <created>Wed, 9 Nov 2022 08:29:22 +0000</created>
                <updated>Wed, 19 Jul 2023 10:14:03 +0000</updated>
                                                                                                <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="5311859" author="JIRAUSER1264163" created="Thu, 30 Mar 2023 13:33:35 +0000"  >&lt;p&gt;As of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-70127&quot; title=&quot;Default system operations to be killable by stepdown&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-70127&quot;&gt;&lt;del&gt;SERVER-70127&lt;/del&gt;&lt;/a&gt; we can now find all the unkillable operations easily.&lt;/p&gt;</comment>
                            <comment id="5139457" author="JIRAUSER1264163" created="Wed, 25 Jan 2023 13:53:51 +0000"  >&lt;p&gt;In this case, the OplogApplier holds the RSTL while applying operations. It needs the lock to protect against replica set changes. And yes, as a result of applying operations it will inevitably have to hold MODE_X locks for DDL operations on collections. Also important: the OplogApplier threads are not marked as killable and thus fall under the &quot;unkillable system connection&quot; umbrella.&lt;/p&gt;

&lt;p&gt;The entirety of initial sync machinery also takes the lock and acquires strong MODE_X locks.&lt;/p&gt;

&lt;p&gt;The biggest problem I found is that, oddly enough, the method for marking a client connection as unkillable is the inverse. That is, &lt;b&gt;by default all operations are unkillable and are then explicitly marked as killable&lt;/b&gt; using &lt;tt&gt;Client::canKillSystemOperationInStepdown&lt;/tt&gt;. Finding all unkillable system connections thus becomes impossible with a search on the method. Instead, the only way to &quot;find&quot; the unkillable clients is to search for calls to &lt;tt&gt;ServiceContext::makeClient&lt;/tt&gt; and look for calls that don&apos;t have a session handle.&lt;/p&gt;</comment>
                            <comment id="5137354" author="max.hirschhorn@10gen.com" created="Tue, 24 Jan 2023 19:09:16 +0000"  >&lt;blockquote&gt;
&lt;p&gt;since it would be a false positive in secondaries.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Great, tell me more about this! What is the sequence of lock acquisitions during secondary oplog application you&apos;re describing? Is it that secondary oplog application takes the RSTL IX lock in an uninterruptible Client and then also takes a collection X lock when applying certain ops?&lt;/p&gt;

&lt;p&gt;Is the RSTL lock relevant for how secondary oplog application must run? Or is it that we lack a different mechanism to have the writer threads opt&amp;#45;out of it?&lt;/p&gt;

&lt;p&gt;I&apos;m fully bought into the idea that prepared transactions wouldn&apos;t be part of a deadlock on secondaries, because secondaries don&apos;t hold any locks for prepared transactions. I&apos;d like to understand the lock story more in general because other collection locks may still be taken on a secondary by operations which aren&apos;t related to prepared transactions.&lt;/p&gt;</comment>
                            <comment id="5133790" author="JIRAUSER1264163" created="Mon, 23 Jan 2023 17:44:43 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=max.hirschhorn%40mongodb.com&quot; class=&quot;user-hover&quot; rel=&quot;max.hirschhorn@mongodb.com&quot;&gt;max.hirschhorn@mongodb.com&lt;/a&gt; Secondaries not holding locks means the described deadlock cannot happen in them. In this case it is only Primaries that hold the locks and can deadlock. This is because only primaries need to control for writes; secondaries apply all the writes atomically because they receive the full transaction operations from the primary when it commits there. It&apos;s as if they perform an applyOps with all the writes.&lt;/p&gt;

&lt;p&gt;To check for the conditions described in this ticket, one needs to know the current node&apos;s role in the Replica Set since it would be a false positive in secondaries. This is the layering violation that I mentioned, as the locking layer would need to know about replication.&lt;/p&gt;</comment>
                            <comment id="5128559" author="max.hirschhorn@10gen.com" created="Fri, 20 Jan 2023 23:29:34 +0000"  >&lt;p&gt;Thanks for reopening this ticket. &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=jordi.olivares-provencio%40mongodb.com&quot; class=&quot;user-hover&quot; rel=&quot;jordi.olivares-provencio@mongodb.com&quot;&gt;jordi.olivares-provencio@mongodb.com&lt;/a&gt;, would you mind adding more detail as to how the replica set member state enters into the picture?&lt;/p&gt;

&lt;p&gt;My understanding is the following -&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;An unkillable operation is determined by its Client, or perhaps also whether there&apos;s an UninterruptibleLockGuard object in scope.&lt;/li&gt;
	&lt;li&gt;Whether the OperationContext is holding the RSTL is based on the operation&apos;s current lock state.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;What is it about prepared transactions and how they are applied by secondaries which complicates checking the conditions stated in the description? What is it about secondaries not holding locks for prepared transactions which result in a violation of those conditions?&lt;/p&gt;</comment>
                            <comment id="5126859" author="JIRAUSER1264163" created="Fri, 20 Jan 2023 14:18:55 +0000"  >&lt;p&gt;Re-opening and putting in the backlog. The patch used in order to investigate this can be found here: &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/attachment/427116/427116_diff.patch&quot; title=&quot;diff.patch attached to SERVER-71198&quot;&gt;diff.patch&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.mongodb.org/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;</comment>
                            <comment id="5126666" author="JIRAUSER1264163" created="Fri, 20 Jan 2023 12:15:20 +0000"  >&lt;p&gt;Adding the considered assertion requires a layering violation during locking. The invariant suggested here requires distinguishing whether the node is in a primary state, since prepared transactions only acquire locks on a primary node. This is a layering violation as the locking component shouldn&apos;t have to know anything about the replication layer above it.&lt;/p&gt;

&lt;p&gt;This is due to the following observation from the Replication &lt;a href=&quot;https://github.com/mongodb/mongo/blob/65b5512fb09ac732826b66c3c6de0ed878751872/src/mongo/db/repl/README.md?plain=1#L1007&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;readme&lt;/a&gt; that explains why it should only be checked in a Primary node (emphasis added):&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;Once stepdown finishes, the node will yield locks from all prepared transactions since &lt;b&gt;secondaries don&apos;t hold locks for their transactions&lt;/b&gt;.&lt;/cite&gt;&lt;/p&gt;

&lt;p&gt;As part of this project we discovered only one potential deadlock in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-73036&quot; title=&quot;Investigate potential deadlock with index builds&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-73036&quot;&gt;&lt;del&gt;SERVER-73036&lt;/del&gt;&lt;/a&gt; as other places that would show this invariant were in internal collections that are used by sharding and replication. The use of a prepared transaction there would be a problem on its own as those collections are local to the node.&lt;/p&gt;

&lt;p&gt;Given all the above, we are closing this ticket as Won&apos;t Fix.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1009935">SERVER-44722</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2178707">SERVER-71191</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2297842">SERVER-75288</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="2383665">SERVER-78662</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2237301">SERVER-73036</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2234619">SERVER-72898</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="2234607">SERVER-72897</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="427116" name="diff.patch" size="12548" author="jordi.olivares-provencio@mongodb.com" created="Fri, 20 Jan 2023 14:18:40 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25136"><![CDATA[Storage Execution]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 20 Jan 2023 12:15:20 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        44 weeks, 6 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-3075</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>josef.ahmad@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            44 weeks, 6 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-execution</customfieldvalue>
            <customfieldvalue>jordi.olivares-provencio@mongodb.com</customfieldvalue>
            <customfieldvalue>louis.williams@mongodb.com</customfieldvalue>
            <customfieldvalue>max.hirschhorn@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i1hgg7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i0xznj:9</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="6554">Execution Team 2022-12-12</customfieldvalue>
    <customfieldvalue id="6672">Execution Team 2022-12-26</customfieldvalue>
    <customfieldvalue id="6673">Execution Team 2023-01-09</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i1h2lj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>