<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:22:21 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-72622] Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.</title>
                <link>https://jira.mongodb.org/browse/SERVER-72622</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;The tenant oplog applier first applies the writes in a given oplog batch and then writes a no-op for each donor oplog entry in that batch; &lt;a href=&quot;https://github.com/10gen/mongo/blob/451eb6b2a5c77b664db74ca966b9bf9706490e6b/src/mongo/db/repl/tenant_oplog_applier.cpp#L481-L513&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;these no-ops are written in parallel using the writer pool threads&lt;/a&gt;. To calculate the &lt;a href=&quot;https://github.com/10gen/mongo/blob/451eb6b2a5c77b664db74ca966b9bf9706490e6b/src/mongo/db/repl/tenant_migration_recipient_service.cpp#L1979&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;resume point&lt;/a&gt; on recipient failover, we traverse backwards through the oplog collection and find the most recent no-op oplog entry from the current migration. Because the no-ops are written in parallel, the most recent no-op can be present while earlier ones are not, so resuming the tenant oplog applier after a recipient failover can miss writing no-op entries for donor oplog entries. The consequences would be: &lt;br/&gt;
1) We might miss updating the session entries in the config.transactions table for multi-statement replica set transactions, leading to duplicate transaction commits. &lt;br/&gt;
2) The oplog chain for retryable writes might be missing. &lt;br/&gt;
3) Change streams might miss generating change events. &lt;/p&gt;</description>
                <environment></environment>
        <key id="2227873">SERVER-72622</key>
            <summary>Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="12300">Won&apos;t Do</resolution>
                                        <assignee username="christopher.caplinger@mongodb.com">Christopher Caplinger</assignee>
                                    <reporter username="suganthi.mani@mongodb.com">Suganthi Mani</reporter>
                        <labels>
                    </labels>
                <created>Mon, 9 Jan 2023 15:35:56 +0000</created>
                <updated>Thu, 15 Jun 2023 22:28:51 +0000</updated>
                            <resolved>Thu, 15 Jun 2023 22:28:51 +0000</resolved>
                                                                                        <votes>0</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="5353477" author="JIRAUSER1262830" created="Mon, 17 Apr 2023 18:20:16 +0000"  >&lt;p&gt;Due to the complexity of the fix and the side effects encountered while working on it, we decided to abort tenant migrations on recipient failover (&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-75990&quot; title=&quot;Tenant Migrations are not resilient to recipient failover&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-75990&quot;&gt;&lt;del&gt;SERVER-75990&lt;/del&gt;&lt;/a&gt;). We are therefore closing this ticket as Won&apos;t Do (the commits were rolled back).&lt;/p&gt;</comment>
                            <comment id="5335116" author="xgen-internal-githook" created="Mon, 10 Apr 2023 14:38:10 +0000"  >&lt;p&gt;Author: &lt;/p&gt;
{&apos;name&apos;: &apos;Christopher Caplinger&apos;, &apos;email&apos;: &apos;christopher.caplinger@mongodb.com&apos;, &apos;username&apos;: &apos;UnicodeSnowman&apos;}
&lt;p&gt;Message: Revert &quot;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-72622&quot; title=&quot;Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-72622&quot;&gt;&lt;del&gt;SERVER-72622&lt;/del&gt;&lt;/a&gt;: Track TenantOplogApplier progress in replicated collection&quot;&lt;/p&gt;

&lt;p&gt;This reverts commit 636ad08bdb6fb5c9ad98ee4cdde8f52929c29830.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/c20a1829195384e6f9737cdba13b850364366e1d&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/c20a1829195384e6f9737cdba13b850364366e1d&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="5333568" author="xgen-internal-githook" created="Sat, 8 Apr 2023 01:42:44 +0000"  >&lt;p&gt;Author: &lt;/p&gt;
{&apos;name&apos;: &apos;Christopher Caplinger&apos;, &apos;email&apos;: &apos;christopher.caplinger@mongodb.com&apos;, &apos;username&apos;: &apos;UnicodeSnowman&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-72622&quot; title=&quot;Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-72622&quot;&gt;&lt;del&gt;SERVER-72622&lt;/del&gt;&lt;/a&gt;: Track TenantOplogApplier progress in replicated collection&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/636ad08bdb6fb5c9ad98ee4cdde8f52929c29830&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/636ad08bdb6fb5c9ad98ee4cdde8f52929c29830&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="5307112" author="xgen-internal-githook" created="Tue, 28 Mar 2023 19:10:53 +0000"  >&lt;p&gt;Author: &lt;/p&gt;
{&apos;name&apos;: &apos;Suganthi Mani&apos;, &apos;email&apos;: &apos;suganthi.mani@mongodb.com&apos;, &apos;username&apos;: &apos;smani87&apos;}
&lt;p&gt;Message: Revert &quot;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-72622&quot; title=&quot;Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-72622&quot;&gt;&lt;del&gt;SERVER-72622&lt;/del&gt;&lt;/a&gt;: Track TenantOplogApplier progress in replicated collection&quot;&lt;/p&gt;

&lt;p&gt;This reverts commit 3c130a69eaddc7cb44895f57af4da6e39556dbb4.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/726362404fbe9fc1dc75f97fabcb47e76efac8d7&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/726362404fbe9fc1dc75f97fabcb47e76efac8d7&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="5306738" author="suganthi.mani" created="Tue, 28 Mar 2023 17:27:08 +0000"  >&lt;p&gt;Reverting the commit due to a build failure. On further inspection, the changes in &lt;a href=&quot;https://github.com/mongodb/mongo/commit/3c130a69eaddc7cb44895f57af4da6e39556dbb4&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this&lt;/a&gt; commit can also lead to deadlocks and a perf regression, so we will revert the commit and rework the changes to address those problems.&lt;/p&gt;

&lt;p&gt;The problems with &lt;a href=&quot;https://github.com/mongodb/mongo/commit/3c130a69eaddc7cb44895f57af4da6e39556dbb4&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the commit&lt;/a&gt; in detail:&lt;/p&gt;

&lt;p&gt;1) The root cause of the BF is that, with the new change, it is possible on recipient failover to reapply retryable write and transaction donor oplog entries with a transaction number older than the current active transaction number for a given session. We handled the already-executed case for &lt;a href=&quot;https://github.com/mongodb/mongo/blob/3c130a69eaddc7cb44895f57af4da6e39556dbb4/src/mongo/db/repl/tenant_oplog_applier.cpp#L896-L907&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;statementIds&lt;/a&gt;, but not for transaction numbers. As a result, the test was throwing &lt;tt&gt;ErrorCodes::TransactionTooOld&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;2) We can get into a 2-way deadlock between stepdown and TenantOplogApplier startup at &lt;a href=&quot;https://github.com/mongodb/mongo/blob/3c130a69eaddc7cb44895f57af4da6e39556dbb4/src/mongo/db/repl/tenant_oplog_applier.cpp#L165&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this&lt;/a&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; and &lt;a href=&quot;https://github.com/mongodb/mongo/blob/3c130a69eaddc7cb44895f57af4da6e39556dbb4/src/mongo/db/repl/tenant_oplog_applier.cpp#L166&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this&lt;/a&gt; &lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; line in the tenant oplog applier code (see &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-60872&quot; title=&quot;Deadlock between stepDown and TenantOplogApplier startup&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-60872&quot;&gt;&lt;del&gt;SERVER-60872&lt;/del&gt;&lt;/a&gt; for more info on the lock ordering violation). Additionally, &lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; makes a storage call with the tenant oplog applier mutex held, which is an anti-pattern. We should avoid making storage calls with a mutex held; a mutex should be taken only for short critical sections.&lt;/p&gt;

&lt;p&gt;3) With the changes, we are &lt;a href=&quot;https://github.com/mongodb/mongo/blob/3c130a69eaddc7cb44895f57af4da6e39556dbb4/src/mongo/db/repl/tenant_oplog_applier.cpp#L679&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;checking out the session&lt;/a&gt;, which reads the oplog from disk to load the retryable write chain, and this session gets &lt;a href=&quot;https://github.com/mongodb/mongo/blob/3c130a69eaddc7cb44895f57af4da6e39556dbb4/src/mongo/db/repl/tenant_oplog_applier.cpp#L625&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;invalidated&lt;/a&gt; at the end of each batch. This means every batch has to read from disk to load the retryable write oplog chain when checking out a given session. This could lead to a perf regression in tenant migration.&lt;/p&gt;

&lt;p&gt;Additionally, I noted a minor change: on FCV 6.3 and MongoDB binary, we &lt;a href=&quot;https://github.com/10gen/mongo/blob/c4ff531484d888bc2b1eebb02a28df63e41a0437/src/mongo/db/repl/tenant_migration_recipient_service.cpp#L2704&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;scan the oplog backwards&lt;/a&gt; to find the resume opTime even without recipient failovers. This is a change in behavior from the original, where we try to find the &lt;a href=&quot;https://github.com/10gen/mongo/blob/2dbc2a40b841eef00e2ad1b79e3f938bad889c58/src/mongo/db/repl/tenant_migration_recipient_service.cpp#L2677-L2684&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;resume opTime only when the resume phase is `ResumePhase::kOplogCatchup`&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;(Update: Uploaded a diff patch to address problems 1 &amp;amp; 3, with base commit hash 00e9ca719a7ef980e756ba27ba21dec44d69cf0f)&lt;/b&gt;&lt;/p&gt;</comment>
                            <comment id="5290105" author="xgen-internal-githook" created="Tue, 21 Mar 2023 21:03:57 +0000"  >&lt;p&gt;Author: &lt;/p&gt;
{&apos;name&apos;: &apos;Christopher Caplinger&apos;, &apos;email&apos;: &apos;christopher.caplinger@mongodb.com&apos;, &apos;username&apos;: &apos;UnicodeSnowman&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-72622&quot; title=&quot;Resuming tenant oplog applier due to recipient failover can miss writing no-op entries for donor oplog entries.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-72622&quot;&gt;&lt;del&gt;SERVER-72622&lt;/del&gt;&lt;/a&gt;: Track TenantOplogApplier progress in replicated collection&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/3c130a69eaddc7cb44895f57af4da6e39556dbb4&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/3c130a69eaddc7cb44895f57af4da6e39556dbb4&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10420">
                    <name>Backports</name>
                                            <outwardlinks description="backported by">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10520">
                    <name>Problem/Incident</name>
                                            <outwardlinks description="causes">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="2227891">SERVER-72623</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="2312232">SERVER-75990</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="440152" name="SERVER-72622.patch" size="11609" author="suganthi.mani@mongodb.com" created="Thu, 30 Mar 2023 06:54:15 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 9 Jan 2023 16:12:16 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        42 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>suganthi.mani@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            42 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>173.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>christopher.caplinger@mongodb.com</customfieldvalue>
            <customfieldvalue>didier.nadeau@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>suganthi.mani@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i1pphr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i188kw:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="6907">Server Serverless 2023-02-06</customfieldvalue>
    <customfieldvalue id="6946">Server Serverless 2023-02-20</customfieldvalue>
    <customfieldvalue id="7086">Server Serverless 2023-03-06</customfieldvalue>
    <customfieldvalue id="7087">Server Serverless 2023-03-20</customfieldvalue>
    <customfieldvalue id="7088">Server Serverless 2023-04-03</customfieldvalue>
    <customfieldvalue id="7292">Server Serverless 2023-04-17</customfieldvalue>
    <customfieldvalue id="7293">Server Serverless 2023-05-01</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i1pbn3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>