<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:49:36 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-38660] fuzzer can cause tests to timeout when executing secondary reads with transactions</title>
                <link>https://jira.mongodb.org/browse/SERVER-38660</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Once a secondary read-only transaction starts and is stashed, the applier blocks when it tries to acquire the global X lock (for example, when replicating a create collection command). Once the X lock request is queued, new requests to the secondary queue behind it until it is satisfied or abandoned. This can cause a deadlock scenario like this:&lt;/p&gt;

&lt;p&gt;1. Txn1 starts, does ops, and stashes its locks.&lt;br/&gt;
2. Repl applier requests global X, which conflicts with the stashed locks, so the lock request is queued.&lt;br/&gt;
3. Txn1 continues, checks out the session, and tries to satisfy read concern. This involves checking whether the oplog collection exists, which requires global IS, so it is queued behind #2 (note: this is before the locks are unstashed).&lt;br/&gt;
4. Periodic Txn Killer sees that Txn1 is already expired and tries to kill it by checking out the session, but it is blocked waiting for step #3 to check the session back in.&lt;/p&gt;

&lt;p&gt;More notes:&lt;br/&gt;
Periodic Txn Killer actually kills the opCtx of the session before trying to check it out, but step #3 is also blocked on the pbwm resource mutex while trying to acquire the GlobalLock. That operation doesn&apos;t use the opCtx, so it cannot be interrupted by killOp.&lt;/p&gt;</description>
                <environment></environment>
        <key id="654930">SERVER-38660</key>
            <summary>fuzzer can cause tests to timeout when executing secondary reads with transactions</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="tess.avitabile@mongodb.com">Tess Avitabile</assignee>
                                    <reporter username="randolph@mongodb.com">Randolph Tan</reporter>
                        <labels>
                    </labels>
                <created>Fri, 14 Dec 2018 22:19:38 +0000</created>
                <updated>Wed, 23 Jan 2019 19:14:22 +0000</updated>
                            <resolved>Wed, 23 Jan 2019 16:56:21 +0000</resolved>
                                    <version>4.1.6</version>
                                                                        <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="2124345" author="tess.avitabile" created="Wed, 23 Jan 2019 16:56:21 +0000"  >&lt;p&gt;Closing as a duplicate of&#160;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39139&quot; title=&quot;Remove testing support for secondary transactions&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39139&quot;&gt;&lt;del&gt;SERVER-39139&lt;/del&gt;&lt;/a&gt;, since we will remove support for secondary transactions entirely.&lt;/p&gt;</comment>
                            <comment id="2121312" author="tess.avitabile" created="Sat, 19 Jan 2019 15:31:34 +0000"  >&lt;p&gt;Sure, I can make a separate ticket to disable secondary transactions in the fuzzer. I&apos;ll wait to see what happens with &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39096&quot; title=&quot;Prepared transactions and DDL operations can deadlock on a secondary, if a reader blocks on a prepared document&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39096&quot;&gt;&lt;del&gt;SERVER-39096&lt;/del&gt;&lt;/a&gt; first.&lt;/p&gt;

&lt;p&gt;I don&apos;t think it is worthwhile to test secondary transactions in the fuzzer. Since they are not a supported feature, if they deadlock or crash in a new way, I don&apos;t want us to have to take the time to investigate it.&lt;/p&gt;</comment>
                            <comment id="2121139" author="max.hirschhorn@10gen.com" created="Sat, 19 Jan 2019 00:03:11 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Would it be acceptable to repurpose this ticket to say that the fuzzer must not run transactions on secondaries? An alternative is to wait and see if we need to disable transactions on secondaries entirely due to&#160;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39096&quot; title=&quot;Prepared transactions and DDL operations can deadlock on a secondary, if a reader blocks on a prepared document&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39096&quot;&gt;&lt;del&gt;SERVER-39096&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;In terms of JIRA process, I&apos;d prefer we make a separate TIG ticket for the changes to the fuzzer. (We can close this ticket as &quot;Won&apos;t fix&quot; or Backlog it until transactions on secondaries are a thing.)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=tess.avitabile&quot; class=&quot;user-hover&quot; rel=&quot;tess.avitabile&quot;&gt;tess.avitabile&lt;/a&gt;, would it be possible for the fuzzer to continue to run transactions on secondaries so long as none of the statements include a &lt;tt&gt;readConcern&lt;/tt&gt; with &lt;tt&gt;afterCluster&lt;/tt&gt;? It wasn&apos;t clear to me what the actual steps to reproduce the problem were, but we generally try to make the blacklisting as narrow as possible.&lt;/p&gt;</comment>
                            <comment id="2120968" author="tess.avitabile" created="Fri, 18 Jan 2019 21:29:42 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=max.hirschhorn&quot; class=&quot;user-hover&quot; rel=&quot;max.hirschhorn&quot;&gt;max.hirschhorn&lt;/a&gt;, I think the best way to resolve this is for the fuzzer to not run transactions on secondaries. Secondary read-only transactions are not supported outside of tests, and we do not require that they do not cause deadlocks. A background dbhash check would never encounter this scenario, since it (correctly) does not send &lt;tt&gt;readConcern&lt;/tt&gt; after the first command in a transaction.&lt;/p&gt;

&lt;p&gt;Would it be acceptable to repurpose this ticket to say that the fuzzer must not run transactions on secondaries? An alternative is to wait and see if we need to disable transactions on secondaries entirely due to&#160;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39096&quot; title=&quot;Prepared transactions and DDL operations can deadlock on a secondary, if a reader blocks on a prepared document&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39096&quot;&gt;&lt;del&gt;SERVER-39096&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="2107991" author="max.hirschhorn@10gen.com" created="Mon, 7 Jan 2019 22:15:44 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Is the secondary read only transaction that&apos;s occurring here a dbhash? If we don&apos;t use background dbhash checks in the fuzzer, we can ban secondary read only transactions in the fuzzer.&lt;/p&gt;

&lt;p&gt;If it is a background dbhash check, this might be fixed by &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-38341&quot; title=&quot;Remove Parallel Batch Writer Mutex&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-38341&quot;&gt;&lt;del&gt;SERVER-38341&lt;/del&gt;&lt;/a&gt;, which will remove the pbwm lock.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Just to clarify - We don&apos;t run the &lt;tt&gt;CheckReplDBHashInBackground&lt;/tt&gt; hook during the &lt;tt&gt;jstestfuzz*&lt;/tt&gt; test suites. I&apos;d say (1) it&apos;s unlikely for the fuzzer to do a large enough volume of writes to exercise timestamping in a particularly interesting way to make the check worthwhile, and (2) the manipulation of the &lt;tt&gt;transactionLifetimeLimitSeconds&lt;/tt&gt; server parameter to be &lt;tt&gt;1e9&lt;/tt&gt; could be inherited by concurrent fuzzer operations and would lead to Evergreen timeouts since the fuzzer isn&apos;t guaranteed to abort the transactions it starts.&lt;/p&gt;</comment>
                            <comment id="2107722" author="renctan" created="Mon, 7 Jan 2019 20:13:49 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=greg.mckeon&quot; class=&quot;user-hover&quot; rel=&quot;greg.mckeon&quot;&gt;greg.mckeon&lt;/a&gt; I don&apos;t think so. The fuzzer had a read preference of secondary only, and it can run arbitrary commands (since it&apos;s the fuzzer test). The operation in the build failure, if I remember correctly, was just a plain find on the secondary. Also, if I remember correctly, the failure is easy to reproduce: I had a lot of success reproducing the hang by running the generated test file locally.&lt;/p&gt;</comment>
                            <comment id="2107692" author="judah.schvimer" created="Mon, 7 Jan 2019 20:06:01 +0000"  >&lt;p&gt;The PBWM is not interruptible because it uses the &lt;tt&gt;lock&lt;/tt&gt; function &lt;a href=&quot;https://github.com/mongodb/mongo/blob/58454a7a2f5731b6810811d19816581b9e30b4ea/src/mongo/db/concurrency/lock_state.h#L171&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;that is not interruptible&lt;/a&gt;. Should we audit all users of this and remove any that are problematic or unnecessary? Any instance of &lt;a href=&quot;https://github.com/mongodb/mongo/blob/58454a7a2f5731b6810811d19816581b9e30b4ea/src/mongo/db/concurrency/d_concurrency.cpp#L346&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;&lt;tt&gt;ResourceLock&lt;/tt&gt;&lt;/a&gt; falls into this category. CC &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=geert.bosch&quot; class=&quot;user-hover&quot; rel=&quot;geert.bosch&quot;&gt;geert.bosch&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2107566" author="siyuan.zhou@10gen.com" created="Mon, 7 Jan 2019 19:12:02 +0000"  >&lt;p&gt;We are also wondering why the second operation in the transaction needs to wait for read concern. Does the second transactional operation come with a read concern?&lt;/p&gt;

&lt;p&gt;As discussed with &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=milkie&quot; class=&quot;user-hover&quot; rel=&quot;milkie&quot;&gt;milkie&lt;/a&gt; weeks ago, another possible fix is to compare&#160;&lt;tt&gt;lastApplied&lt;/tt&gt;&#160;from the replication coordinator, rather than the top of the oplog, with the timestamp from the storage, when waiting for all earlier writes to commit. We always maintain&#160;&lt;tt&gt;lastApplied&lt;/tt&gt; on both primary and secondaries.&lt;/p&gt;</comment>
                            <comment id="2107541" author="greg.mckeon" created="Mon, 7 Jan 2019 18:58:58 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=renctan&quot; class=&quot;user-hover&quot; rel=&quot;renctan&quot;&gt;renctan&lt;/a&gt; or &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=max.hirschhorn&quot; class=&quot;user-hover&quot; rel=&quot;max.hirschhorn&quot;&gt;max.hirschhorn&lt;/a&gt;&lt;br/&gt;
Is the secondary read only transaction that&apos;s occurring here a dbhash?  If we don&apos;t use background dbhash checks in the fuzzer, we can ban secondary read only transactions in the fuzzer.&lt;/p&gt;

&lt;p&gt;If it is a background dbhash check, this might be fixed by &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-38341&quot; title=&quot;Remove Parallel Batch Writer Mutex&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-38341&quot;&gt;&lt;del&gt;SERVER-38341&lt;/del&gt;&lt;/a&gt;, which will remove the pbwm lock.&lt;/p&gt;</comment>
                            <comment id="2098810" author="renctan" created="Fri, 21 Dec 2018 19:50:36 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=max.hirschhorn&quot; class=&quot;user-hover&quot; rel=&quot;max.hirschhorn&quot;&gt;max.hirschhorn&lt;/a&gt; I think I know what is going on. It looks like the session killer is running but there&apos;s a deadlock that made it stuck. Updated the description.&lt;/p&gt;</comment>
                            <comment id="2097513" author="renctan" created="Thu, 20 Dec 2018 19:00:33 +0000"  >&lt;p&gt;I haven&apos;t looked at the exact operation, but I believe they were read-only transactions, since I think the transaction participant didn&apos;t have any ops stashed but still held the locks. My initial thought was that it should have expired as well, but it was not getting killed (I can semi-reliably repro this on my local machine). Then I saw the 3hr transaction lifetime limits. After looking at them again today, they appear to just be TestData variables. I&apos;ll try to look again tomorrow.&lt;/p&gt;</comment>
                            <comment id="2094550" author="max.hirschhorn@10gen.com" created="Tue, 18 Dec 2018 16:47:16 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Setup: Fuzzer is currently set to have a transaction lifetime of 3hrs.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=renctan&quot; class=&quot;user-hover&quot; rel=&quot;renctan&quot;&gt;renctan&lt;/a&gt;, &lt;a href=&quot;https://github.com/mongodb/mongo/blob/ca80f766ffc4f82beb46e7b13132afc0c555d4aa/buildscripts/resmokeconfig/suites/jstestfuzz_sharded_causal_consistency.yml#L41&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the &lt;tt&gt;jstestfuzz*&lt;/tt&gt; test suites set a transaction lifetime of 1 second&lt;/a&gt;. What transaction were you expecting to be aborted by the reaper that wasn&apos;t?&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="676205">SERVER-39139</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="676205">SERVER-39139</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 18 Dec 2018 16:47:16 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        5 years, 3 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>tess.avitabile@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            5 years, 3 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>greg.mckeon@mongodb.com</customfieldvalue>
            <customfieldvalue>judah.schvimer@mongodb.com</customfieldvalue>
            <customfieldvalue>max.hirschhorn@mongodb.com</customfieldvalue>
            <customfieldvalue>randolph@mongodb.com</customfieldvalue>
            <customfieldvalue>siyuan.zhou@mongodb.com</customfieldvalue>
            <customfieldvalue>tess.avitabile@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hugga7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hu6j1j:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2702">Repl 2019-01-28</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hug2jj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>