<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:50:58 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-39082] Retry on TransientTransaction errors in targeted transaction jstests</title>
                <link>https://jira.mongodb.org/browse/SERVER-39082</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;If a jstest that manually runs transaction statements fails with a TransientTransaction error, it is safe to retry the whole transaction, but it currently has no way to retry the previous statements, so the test will fail unnecessarily. The suites that run &lt;tt&gt;core/txns&lt;/tt&gt; tests against sharded clusters are particularly susceptible to this, because of the chance of snapshot errors on slow hosts.&lt;/p&gt;

&lt;p&gt;Each targeted test / suite could use an override file (similar to the existing &lt;tt&gt;txn_override.js&lt;/tt&gt;) that batches transaction operations and supports advancing a session&apos;s txnNumber transparently to do this. The override wouldn&apos;t have to handle network errors or failovers, just transient transaction errors.&lt;/p&gt;</description>
                <environment></environment>
        <key id="674824">SERVER-39082</key>
            <summary>Retry on TransientTransaction errors in targeted transaction jstests</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="12300">Won&apos;t Do</resolution>
                                        <assignee username="haley.connelly@mongodb.com">Haley Connelly</assignee>
                                    <reporter username="jack.mulrow@mongodb.com">Jack Mulrow</reporter>
                        <labels>
                            <label>sharding-4.4-stabilization</label>
                            <label>sharding-wfbf-day</label>
                    </labels>
                <created>Fri, 18 Jan 2019 16:48:52 +0000</created>
                <updated>Tue, 31 Mar 2020 15:29:50 +0000</updated>
                            <resolved>Tue, 31 Mar 2020 15:29:50 +0000</resolved>
                                                                    <component>Sharding</component>
                    <component>Testing Infrastructure</component>
                                        <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="3019255" author="haley.connelly" created="Tue, 31 Mar 2020 15:29:06 +0000"  >&lt;p&gt;We haven&apos;t seen any recent failures on the 4.2, 4.4, or master branches. &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=judah.schvimer&quot; class=&quot;user-hover&quot; rel=&quot;judah.schvimer&quot;&gt;judah.schvimer&lt;/a&gt; talked about enableMajorityReadConcern=false in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-46510&quot; title=&quot;Concurrent drop-pending notifications can race with subsequent transactions&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-46510&quot;&gt;&lt;del&gt;SERVER-46510&lt;/del&gt;&lt;/a&gt;, but even when passing --majorityReadConcern=off to resmoke, we didn&apos;t see failures on the latest master with transaction_error_handling.js. There are still failures on the 4.0 branch; they are still the scenario that Judah described in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-46510&quot; title=&quot;Concurrent drop-pending notifications can race with subsequent transactions&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-46510&quot;&gt;&lt;del&gt;SERVER-46510&lt;/del&gt;&lt;/a&gt;. Closing this as won&apos;t do.&lt;/p&gt;</comment>
                            <comment id="3013902" author="haley.connelly" created="Mon, 30 Mar 2020 20:57:40 +0000"  >&lt;p&gt;On a 4.0 spawned host, the error can be reproduced by setting maxTransactionLockRequestTimeoutMillis=1 and running resmoke with --suites=logical_session_cache_replication_100ms_refresh_jscore_passthrough jstests/core/txns/no_implicit_collection_creation_in_txn.js on repeat (I did it 100 times).&lt;/p&gt;</comment>
                            <comment id="2150335" author="max.hirschhorn@10gen.com" created="Thu, 14 Feb 2019 23:06:18 +0000"  >&lt;blockquote&gt;
&lt;p&gt;My only problem with this approach is that it will still be possible for a targeted transaction test to fail because of an unexpected SnapshotTooOld on a slow host, but I can&apos;t actually find a failure because of that, so I&apos;d be fine addressing that later if it turns out to be a problem.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;If the test fails due to a &lt;tt&gt;SnapshotTooOld&lt;/tt&gt; error response, then either (a) the server has a bug where it isn&apos;t preserving enough history, or (b) the test is race-prone and should use the &lt;tt&gt;WTPreserveSnapshotHistoryIndefinitely&lt;/tt&gt; failpoint to preserve more history. Either way, it sounds reasonable to me to address that separately if it ever comes up.&lt;/p&gt;</comment>
                            <comment id="2150330" author="jack.mulrow" created="Thu, 14 Feb 2019 23:00:02 +0000"  >&lt;p&gt;That&apos;s a good point about raising the max lock timeout. From looking through the linked BFs and their duplicate BFGs, it looks like every one failed trying to take a lock during commit or prepare, and it should be fine to raise the limit for the &lt;tt&gt;core/txns&lt;/tt&gt; tests, since I don&apos;t think their transactions rely on the timeout (except the ones explicitly testing it, where we can use the original value).&lt;/p&gt;

&lt;p&gt;My only problem with this approach is that it will still be possible for a targeted transaction test to fail because of an unexpected SnapshotTooOld on a slow host, but I can&apos;t actually find a failure because of that, so I&apos;d be fine addressing that later if it turns out to be a problem.&lt;/p&gt;</comment>
                            <comment id="2148059" author="max.hirschhorn@10gen.com" created="Wed, 13 Feb 2019 23:17:18 +0000"  >&lt;blockquote&gt;
&lt;p&gt;In particular, since we don&apos;t expect these errors to happen often, I&apos;m wondering if the test infrastructure can somehow retry an entire jstest when it fails with a transient transaction error instead of making the tests retry internally. Max Hirschhorn, what do you think?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=jack.mulrow&quot; class=&quot;user-hover&quot; rel=&quot;jack.mulrow&quot;&gt;jack.mulrow&lt;/a&gt;, I&apos;d be worried that if a test runs enough multi-statement transactions, then one of them is almost sure to fail with a &lt;tt&gt;TransientTransactionError&lt;/tt&gt;, which would cause the entire test to be rerun continuously. Retrying just the multi-statement transaction would increase the likelihood of making forward progress in the test.&lt;/p&gt;

&lt;p&gt;My understanding is that the tests in the &lt;tt&gt;jstests/core/txns/&lt;/tt&gt; directory weren&apos;t written with the expectation that they&apos;d ever encounter transient transaction errors. I think trying to do implicit retries is going to be difficult for some of the reasons you mentioned around mutable state (e.g. the cursor id). Is it possible to change the server&apos;s configuration in the &lt;tt&gt;sharded_*_txns*.yml&lt;/tt&gt; test suites by raising &lt;a href=&quot;https://github.com/mongodb/mongo/blob/r4.1.8/src/mongo/db/transaction_participant.cpp#L67-L74&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the &lt;tt&gt;maxTransactionLockRequestTimeoutMillis&lt;/tt&gt; server parameter&lt;/a&gt; to avoid encountering transient transaction errors? For example, the testing infrastructure sets &lt;a href=&quot;https://github.com/mongodb/mongo/blob/r4.1.8/buildscripts/resmokelib/core/programs.py#L82-L88&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the &lt;tt&gt;transactionLifetimeLimitSeconds&lt;/tt&gt; server parameter&lt;/a&gt; to 3 hours to avoid transactions being killed due to the Evergreen machine being slow.&lt;/p&gt;</comment>
                            <comment id="2145270" author="jack.mulrow" created="Mon, 11 Feb 2019 22:40:08 +0000"  >&lt;p&gt;I&apos;ve spent a little while working on this ticket, and I&apos;m realizing there are a few problems with the override in the description. The main ones are:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Some tests in the linked BFs run concurrent transactions on different sessions, so to cover these cases the override needs to store operations from the latest transaction per session and the global order the test ran them in, which raises questions about how to handle a transient failure in one transaction, but not the others. I think we can handle this by aborting every transaction then retrying every statement in the original order, but I can see writing this taking a while.&lt;/li&gt;
	&lt;li&gt;At least one failing test uses the result from one transaction statement to construct a later statement in the same transaction, so retrying the statements as they were received won&apos;t work if it&apos;s possible for the result from the retry to be different, like this statement in kill_cursors_in_transaction.js &lt;a href=&quot;https://github.com/mongodb/mongo/blob/80f9a13324/jstests/core/txns/kill_cursors_in_transaction.js#L25&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;that kills a cursor established earlier in the transaction&lt;/a&gt; (the cursor established by the retry will have a new cursor id).&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;To resolve the linked BFs, we could write an override without handling these cases and manually add retries to tests that aren&apos;t covered, but that makes me think we should try to find a more generic fix instead. In particular, since we don&apos;t expect these errors to happen often, I&apos;m wondering if the test infrastructure can somehow retry an entire jstest when it fails with a transient transaction error instead of making the tests retry internally. &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=max.hirschhorn&quot; class=&quot;user-hover&quot; rel=&quot;max.hirschhorn&quot;&gt;max.hirschhorn&lt;/a&gt;, what do you think?&lt;/p&gt;

&lt;p&gt;CC &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=judah.schvimer&quot; class=&quot;user-hover&quot; rel=&quot;judah.schvimer&quot;&gt;judah.schvimer&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="778724">SERVER-41338</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1204954">SERVER-46510</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="576733">SERVER-36310</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 13 Feb 2019 23:17:18 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        3 years, 45 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>haley.connelly@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            3 years, 45 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>haley.connelly@mongodb.com</customfieldvalue>
            <customfieldvalue>jack.mulrow@mongodb.com</customfieldvalue>
            <customfieldvalue>max.hirschhorn@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hujppb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hwwldz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2787">Sharding 2019-02-25</customfieldvalue>
    <customfieldvalue id="2824">Sharding 2019-03-11</customfieldvalue>
    <customfieldvalue id="2825">Sharding 2019-03-25</customfieldvalue>
    <customfieldvalue id="2863">Sharding 2019-04-08</customfieldvalue>
    <customfieldvalue id="2864">Sharding 2019-04-22</customfieldvalue>
    <customfieldvalue id="2917">Sharding 2019-05-06</customfieldvalue>
    <customfieldvalue id="3745">Sharding 2020-04-06</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hujbyn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>