<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:52:50 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-39692] Make graceful MongoS shutdown drain all in-progress transactions</title>
                <link>https://jira.mongodb.org/browse/SERVER-39692</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-37344&quot; title=&quot;Implement recovery token for retrying a commit command on a different mongos&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-37344&quot;&gt;&lt;del&gt;SERVER-37344&lt;/del&gt;&lt;/a&gt; implemented recoveryToken support for recovering the outcome of a sharded transaction when running commitTransaction on a recovery mongos (i.e., a mongos that has not seen that transaction and doesn&apos;t know the coordinator or participant list).&lt;/p&gt;

&lt;p&gt;In the case of aborting the transaction against a recovery mongos, the driver will still include the recoveryToken (SPEC-1279), but there are situations where the recovery token might still not be known, which means parts of the transaction could still remain open for up to the max transaction lifetime, potentially blocking other operations.&lt;/p&gt;

&lt;p&gt;Since in such a case neither the participants nor the coordinator might be known (especially with read-only shard optimizations), the only deterministic way of ensuring that the transaction vestiges have been aborted is to broadcast abortTransaction to all shards in the cluster. However, this is not a scalable solution, and it is also a possible vector for a DoS attack, so as part of this ticket we will instead do the next best thing:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Make the graceful MongoS shutdown logic do a best-effort abortTransaction for all in-progress transaction routers. That way we ensure that on maintenance shutdowns we will not leave open transactions.&lt;/li&gt;
	&lt;li&gt;Document the cases where in 4.2 we can leave transactions hanging for a minute, and the manual recovery steps that an operator might be able to take to clear that state before the transactions expire. That would be the case where MongoS hard crashes after having started a transaction on a shard, but before any recovery information is returned to the driver.&lt;/li&gt;
	&lt;li&gt;Post-4.2.0, figure out a format for the recovery token that contains the set of shards involved in the transaction so far. The issue to consider here is how large that token can get: shard ids are strings, so it is theoretically possible to exceed the BSON max size.&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment></environment>
        <key id="701436">SERVER-39692</key>
            <summary>Make graceful MongoS shutdown drain all in-progress transactions</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="randolph@mongodb.com">Randolph Tan</assignee>
                                    <reporter username="shane.harvey@mongodb.com">Shane Harvey</reporter>
                        <labels>
                            <label>ShardedTxn:FutureOptimizations</label>
                            <label>neweng</label>
                            <label>pm-564</label>
                    </labels>
                <created>Wed, 20 Feb 2019 19:40:28 +0000</created>
                <updated>Sun, 29 Oct 2023 22:23:49 +0000</updated>
                            <resolved>Thu, 18 Jul 2019 18:09:36 +0000</resolved>
                                                    <fixVersion>4.2.0-rc5</fixVersion>
                    <fixVersion>4.3.1</fixVersion>
                                    <component>Sharding</component>
                                        <votes>0</votes>
                                    <watches>7</watches>
                                                                                                                <comments>
                            <comment id="2332661" author="xgen-internal-githook" created="Thu, 18 Jul 2019 19:20:33 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Randolph Tan&apos;, &apos;email&apos;: &apos;randolph@10gen.com&apos;, &apos;username&apos;: &apos;renctan&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39692&quot; title=&quot;Make graceful MongoS shutdown drain all in-progress transactions&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39692&quot;&gt;&lt;del&gt;SERVER-39692&lt;/del&gt;&lt;/a&gt; Make mongos shutdown drain all in-progress transactions&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 36dc61299993ce6473a4660150bfb25a59afce77)&lt;br/&gt;
Branch: v4.2&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/52fba7897df46f8a52e590b6cc3a05a24aa93bed&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/52fba7897df46f8a52e590b6cc3a05a24aa93bed&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2332532" author="xgen-internal-githook" created="Thu, 18 Jul 2019 18:08:48 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;Randolph Tan&apos;, &apos;email&apos;: &apos;randolph@10gen.com&apos;, &apos;username&apos;: &apos;renctan&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39692&quot; title=&quot;Make graceful MongoS shutdown drain all in-progress transactions&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39692&quot;&gt;&lt;del&gt;SERVER-39692&lt;/del&gt;&lt;/a&gt; Make mongos shutdown drain all in-progress transactions&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/36dc61299993ce6473a4660150bfb25a59afce77&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/36dc61299993ce6473a4660150bfb25a59afce77&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2241406" author="greg.mckeon" created="Fri, 10 May 2019 14:37:47 +0000"  >&lt;p&gt;The work here is for drivers to address the first bullet, and us to address the second bullet in Kal&apos;s comment.&lt;/p&gt;</comment>
                            <comment id="2213954" author="kaloian.manassiev" created="Mon, 15 Apr 2019 17:54:02 +0000"  >&lt;p&gt;As Andy describes above, broadcasting &lt;tt&gt;abortTransaction&lt;/tt&gt; to all shards in the cluster is not a scalable solution, and it is also a possible vector for a DoS attack.&lt;/p&gt;

&lt;p&gt;I propose that we do the following:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;For backwards compatibility purposes with post-4.2.0 fixes, make the drivers include the recoveryToken for both commit and abortTransaction (&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=shane.harvey&quot; class=&quot;user-hover&quot; rel=&quot;shane.harvey&quot;&gt;shane.harvey&lt;/a&gt; opened SPEC-1279)&lt;/li&gt;
	&lt;li&gt;Make the graceful MongoS shutdown logic do a best-effort &lt;tt&gt;abortTransaction&lt;/tt&gt; for all in-progress transaction routers. That way we ensure that on maintenance shutdowns we will not leave open transactions (tracked by this ticket)&lt;/li&gt;
	&lt;li&gt;Document the cases where in 4.2 we can leave transactions hanging for a minute. That would be the case where MongoS hard crashes (tracked by this ticket)&lt;/li&gt;
	&lt;li&gt;Post-4.2.0, figure out a format for the recovery token that contains the set of shards involved in the transaction so far. The issue to consider here is how large that token can get: shard ids are strings, so it is theoretically possible to exceed the BSON max size.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=alyson.cabral&quot; class=&quot;user-hover&quot; rel=&quot;alyson.cabral&quot;&gt;alyson.cabral&lt;/a&gt;, &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=shane.harvey&quot; class=&quot;user-hover&quot; rel=&quot;shane.harvey&quot;&gt;shane.harvey&lt;/a&gt;, does this plan sound good to you?&lt;/p&gt;</comment>
                            <comment id="2163355" author="esha.maharishi@10gen.com" created="Tue, 26 Feb 2019 18:46:25 +0000"  >&lt;p&gt;I am unlinking this from &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-39726&quot; title=&quot;Recovering the state of an uncommitted transaction should not block&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-39726&quot;&gt;&lt;del&gt;SERVER-39726&lt;/del&gt;&lt;/a&gt;, since aborting a transaction through abortTransaction against a recovery router to expedite releasing the transaction&apos;s resources across the cluster is unrelated to avoiding blocking while recovering a transaction&apos;s decision through commitTransaction against a recovery router.&lt;/p&gt;</comment>
                            <comment id="2158109" author="schwerin" created="Thu, 21 Feb 2019 09:16:35 +0000"  >&lt;p&gt;Broadcast is not an acceptable solution for lost mongos. There are a few other reasons we&#8217;ve considered updating the recovery token on subsequent transaction statements, so &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=shane.harvey&quot; class=&quot;user-hover&quot; rel=&quot;shane.harvey&quot;&gt;shane.harvey&lt;/a&gt;&#8217;s proposal is interesting.&lt;/p&gt;</comment>
                            <comment id="2157758" author="shane.harvey" created="Wed, 20 Feb 2019 22:19:03 +0000"  >&lt;p&gt;I vaguely recall &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=schwerin&quot; class=&quot;user-hover&quot; rel=&quot;schwerin&quot;&gt;schwerin&lt;/a&gt; and &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=esha.maharishi&quot; class=&quot;user-hover&quot; rel=&quot;esha.maharishi&quot;&gt;esha.maharishi&lt;/a&gt; saying that a broadcast-to-all-shards approach would not be acceptable.&lt;/p&gt;

&lt;p&gt;In any case, I&apos;ve reduced the scope of SPEC-1168 to have drivers only include the recoveryToken on commitTransaction and never on abortTransaction so any changes made for this ticket will need driver changes.&lt;/p&gt;</comment>
                            <comment id="2157666" author="renctan" created="Wed, 20 Feb 2019 21:25:19 +0000"  >&lt;p&gt;One way to do this is to have abortTransaction have a mode that makes the mongos broadcast to all shards.&lt;/p&gt;</comment>
                            <comment id="2157516" author="shane.harvey" created="Wed, 20 Feb 2019 20:02:11 +0000"  >&lt;p&gt;Right now, my proposal in SPEC-1168 is for drivers to track the most recently seen &lt;tt&gt;recoveryToken&lt;/tt&gt; and send the &lt;tt&gt;recoveryToken&lt;/tt&gt; along with commitTransaction as well as abortTransaction. This allows the server to update the &lt;tt&gt;recoveryToken&lt;/tt&gt; to include the participant list and therefore a recovery mongos can then use the recoveryToken field to abort the transaction.&lt;/p&gt;

&lt;p&gt;Note: a &lt;tt&gt;recoveryToken&lt;/tt&gt; that includes new participant(s) could be lost due to a network error, so the recovery mongos may leave the transaction open on some participants (unless the abort is broadcast to all shards), but this is still better than leaving the transaction open on &lt;b&gt;all participants&lt;/b&gt;.&lt;/p&gt;

&lt;p&gt;Even if this is not implemented, the behavior of running abort on a recovery mongos must be defined. Consider that even if drivers remain pinned to the same mongos, the mongos could restart and lose the in-memory transaction state. In this case, the driver sends abort to a recovery mongos without even realizing it.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10420">
                    <name>Backports</name>
                                            <outwardlinks description="backported by">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="610799">SERVER-37344</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="21788">SERVER-3744</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_12450" key="com.atlassian.jira.plugin.system.customfieldtypes:multicheckboxes">
                        <customfieldname>Backport Requested</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16775"><![CDATA[v4.2]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 20 Feb 2019 21:25:19 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 29 weeks, 6 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 29 weeks, 6 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>schwerin@mongodb.com</customfieldvalue>
            <customfieldvalue>esha.maharishi@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>greg.mckeon@mongodb.com</customfieldvalue>
            <customfieldvalue>kaloian.manassiev@mongodb.com</customfieldvalue>
            <customfieldvalue>randolph@mongodb.com</customfieldvalue>
            <customfieldvalue>shane.harvey@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|huo6nj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hug0c7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="2863">Sharding 2019-04-08</customfieldvalue>
    <customfieldvalue id="3003">Sharding 2019-07-01</customfieldvalue>
    <customfieldvalue id="3061">Sharding 2019-07-15</customfieldvalue>
    <customfieldvalue id="3062">Sharding 2019-07-29</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hunswv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>