<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:15:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-47844] Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true</title>
                <link>https://jira.mongodb.org/browse/SERVER-47844</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Currently, we update the stable timestamp inside &lt;tt&gt;ReplicationCoordinatorImpl::_setStableTimestampForStorage&lt;/tt&gt; by &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b89a546076135f2d6692111ea25a224355cdbd0e/src/mongo/db/repl/replication_coordinator_impl.cpp#L4836-L4837&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;computing the stable optime&lt;/a&gt; from the set of stable optime candidates. We should remove the dependence on the stable optime candidates for &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b89a546076135f2d6692111ea25a224355cdbd0e/src/mongo/db/repl/replication_coordinator_impl.cpp#L4856-L4857&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;setting the stable timestamp&lt;/a&gt; and &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b89a546076135f2d6692111ea25a224355cdbd0e/src/mongo/db/repl/replication_coordinator_impl.cpp#L4854&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;updating the committed snapshot&lt;/a&gt; when enableMajorityReadConcern:true. We should be able to set the stable timestamp for storage directly as &lt;tt&gt;min(all_durable, lastCommittedOpTime)&lt;/tt&gt;. We will not remove any of the logic for computing and updating the stable optime candidates set as a part of this ticket.&lt;/p&gt;</description>
                <environment></environment>
        <key id="1333345">SERVER-47844</key>
            <summary>Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true</summary>
                <type id="3" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14718&amp;avatarType=issuetype">Task</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="william.schultz@mongodb.com">William Schultz</assignee>
                                    <reporter username="william.schultz@mongodb.com">William Schultz</reporter>
                        <labels>
                    </labels>
                <created>Wed, 29 Apr 2020 21:54:52 +0000</created>
                <updated>Sun, 29 Oct 2023 22:08:54 +0000</updated>
                            <resolved>Tue, 30 Jun 2020 20:16:21 +0000</resolved>
                                                    <fixVersion>4.7.0</fixVersion>
                                    <component>Replication</component>
                                        <votes>0</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="3433034" author="xgen-internal-githook" created="Wed, 7 Oct 2020 20:06:17 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;A. Jesse Jiryu Davis&apos;, &apos;email&apos;: &apos;jesse@mongodb.com&apos;, &apos;username&apos;: &apos;ajdavis&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48518&quot; title=&quot;Rollback via refetch (EMRC = false) can make readers to see the rolled back data even after the rollback node catches up to primary.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48518&quot;&gt;&lt;del&gt;SERVER-48518&lt;/del&gt;&lt;/a&gt; Fix rollback via refetch anomaly, try 2&lt;/p&gt;

&lt;p&gt;Includes the following partial backports:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47844&quot; title=&quot;Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47844&quot;&gt;&lt;del&gt;SERVER-47844&lt;/del&gt;&lt;/a&gt; Add the ability to get the initialDataTimestamp from the storage engine interface&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 1408e1b8a5392a9001ee598b5cec66afc4e1cf77)&lt;br/&gt;
(cherry picked from commit 329d8c517d8b3c3fb4bcb63eecf6064ac9a007cf)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48518&quot; title=&quot;Rollback via refetch (EMRC = false) can make readers to see the rolled back data even after the rollback node catches up to primary.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48518&quot;&gt;&lt;del&gt;SERVER-48518&lt;/del&gt;&lt;/a&gt; Fix rollback via refetch anomaly&lt;/p&gt;

&lt;p&gt;(cherry picked from commit eee49c64cdeb8fa95704b9a316b779eb5eb9800c)&lt;br/&gt;
(cherry picked from commit 88c0265e057f0e5581306f294d1ca2bda19760e4)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-50183&quot; title=&quot;Copy _awaitPrimaryAppliedSurpassesRollbackApplied function from RollbackTest to RollbackTestDeluxe&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-50183&quot;&gt;&lt;del&gt;SERVER-50183&lt;/del&gt;&lt;/a&gt; Copy _awaitPrimaryAppliedSurpassesRollbackApplied function from RollbackTest to RollbackTestDeluxe&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 252251d38915b9e6722186b9742cc914a045d589)&lt;br/&gt;
(cherry picked from commit d4b960b5f3f4a7a2b18b48d7fb14251704a8bda8)&lt;br/&gt;
Branch: v4.0&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/ece51101e58dfaf7e455c8c96df6ade42b99515c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/ece51101e58dfaf7e455c8c96df6ade42b99515c&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3380192" author="xgen-internal-githook" created="Tue, 8 Sep 2020 16:34:46 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;William Schultz&apos;, &apos;email&apos;: &apos;william.schultz@mongodb.com&apos;, &apos;username&apos;: &apos;will62794&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48518&quot; title=&quot;Rollback via refetch (EMRC = false) can make readers to see the rolled back data even after the rollback node catches up to primary.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48518&quot;&gt;&lt;del&gt;SERVER-48518&lt;/del&gt;&lt;/a&gt; Fix rollback via refetch anomaly&lt;/p&gt;

&lt;p&gt;Includes the following partial backports:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47844&quot; title=&quot;Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47844&quot;&gt;&lt;del&gt;SERVER-47844&lt;/del&gt;&lt;/a&gt; Add the ability to get the initialDataTimestamp from the storage engine interface&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 1408e1b8a5392a9001ee598b5cec66afc4e1cf77)&lt;br/&gt;
(cherry picked from commit 329d8c517d8b3c3fb4bcb63eecf6064ac9a007cf)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48518&quot; title=&quot;Rollback via refetch (EMRC = false) can make readers to see the rolled back data even after the rollback node catches up to primary.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48518&quot;&gt;&lt;del&gt;SERVER-48518&lt;/del&gt;&lt;/a&gt; Fix rollback via refetch anomaly&lt;/p&gt;

&lt;p&gt;(cherry picked from commit eee49c64cdeb8fa95704b9a316b779eb5eb9800c)&lt;br/&gt;
(cherry picked from commit 88c0265e057f0e5581306f294d1ca2bda19760e4)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-50183&quot; title=&quot;Copy _awaitPrimaryAppliedSurpassesRollbackApplied function from RollbackTest to RollbackTestDeluxe&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-50183&quot;&gt;&lt;del&gt;SERVER-50183&lt;/del&gt;&lt;/a&gt; Copy _awaitPrimaryAppliedSurpassesRollbackApplied function from RollbackTest to RollbackTestDeluxe&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 252251d38915b9e6722186b9742cc914a045d589)&lt;br/&gt;
(cherry picked from commit d4b960b5f3f4a7a2b18b48d7fb14251704a8bda8)&lt;br/&gt;
Branch: v4.0&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/b07f80de5850c665e75dc259def6b8999d1077dd&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/b07f80de5850c665e75dc259def6b8999d1077dd&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3356872" author="xgen-internal-githook" created="Tue, 25 Aug 2020 02:47:17 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;William Schultz&apos;, &apos;email&apos;: &apos;william.schultz@mongodb.com&apos;, &apos;username&apos;: &apos;will62794&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47844&quot; title=&quot;Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47844&quot;&gt;&lt;del&gt;SERVER-47844&lt;/del&gt;&lt;/a&gt; Add the ability to get the initialDataTimestamp from the storage engine interface&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 1408e1b8a5392a9001ee598b5cec66afc4e1cf77)&lt;br/&gt;
(cherry picked from commit 329d8c517d8b3c3fb4bcb63eecf6064ac9a007cf)&lt;br/&gt;
Branch: v4.2&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/b4d587314e3f8bc9cfb800a9ede40349756dfd86&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/b4d587314e3f8bc9cfb800a9ede40349756dfd86&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3350147" author="jesse" created="Thu, 20 Aug 2020 23:10:24 +0000"  >&lt;p&gt;I cherry-picked the first half of the changes for this ticket to 4.4, since it was a prerequisite for backporting &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-48518&quot; title=&quot;Rollback via refetch (EMRC = false) can make readers to see the rolled back data even after the rollback node catches up to primary.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-48518&quot;&gt;&lt;del&gt;SERVER-48518&lt;/del&gt;&lt;/a&gt; to 4.4. We&apos;re not going to backport the rest of the changes for this ticket.&lt;/p&gt;</comment>
                            <comment id="3350142" author="xgen-internal-githook" created="Thu, 20 Aug 2020 23:07:19 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;William Schultz&apos;, &apos;email&apos;: &apos;william.schultz@mongodb.com&apos;, &apos;username&apos;: &apos;will62794&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47844&quot; title=&quot;Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47844&quot;&gt;&lt;del&gt;SERVER-47844&lt;/del&gt;&lt;/a&gt; Add the ability to get the initialDataTimestamp from the storage engine interface&lt;/p&gt;

&lt;p&gt;(cherry picked from commit 1408e1b8a5392a9001ee598b5cec66afc4e1cf77)&lt;br/&gt;
Branch: v4.4&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/329d8c517d8b3c3fb4bcb63eecf6064ac9a007cf&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/329d8c517d8b3c3fb4bcb63eecf6064ac9a007cf&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3278150" author="william.schultz" created="Thu, 9 Jul 2020 18:23:40 +0000"  >&lt;p&gt;After the changes from this ticket, when enableMajorityReadConcern:true, we no longer use the stable optime candidates list at all to compute the stable timestamp or the committed snapshot. Instead, the &lt;tt&gt;&lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L4948&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ReplicationCoordinatorImpl::_recalculateStableOpTime&lt;/a&gt;&lt;/tt&gt; method &lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L4996&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;computes the stable optime&lt;/a&gt; as the minimum of the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L4985&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;no-overlap&lt;/a&gt; point and the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L4992&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;lastCommittedOpTime&lt;/a&gt;. 
The no-overlap point is a timestamp that can be guaranteed to be &quot;consistent&quot; on both primaries and secondaries, and the concept already has a &lt;a href=&quot;https://github.com/mongodb/mongo/blob/fdd66b8c2c0f5c818a01ecd64a27b0b7c0ca28d6/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp#L617-L641&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;precedent in the storage layer&lt;/a&gt;. The definition of &quot;consistent&quot; here is a bit vague, but, generally, we need a timestamp T with a guarantee that no future transactions will commit at timestamps less than T. &lt;/p&gt;

&lt;p&gt;This is necessary because we need the stable timestamp and the committed snapshot to be safe for readers, since they are both timestamps used for reading data. We want those readers to see a view of data that is consistent with the oplog, i.e. if they read at timestamp T, the data returned should reflect all operations that have been applied in the oplog up to T. There are unsafe windows of timestamps on both primaries and secondaries where this property is violated. On primaries, this occurs because we assign timestamps to concurrent transactions in an order that may be different from their commit order, leading to the creation of oplog &quot;holes&quot;. On secondaries, the mechanisms of parallel batch application make it so that the data on disk during batch application may not reflect the oplog accurately, since the application of ops occurs in parallel and is nondeterministic. So, the no-overlap timestamp provides a unified notion of what timestamps are safely &quot;visible&quot; for timestamp readers to look at. Beyond that timestamp, we may be in an undefined frontier, where reads may return incorrect or inconsistent data. The no-overlap point is thus computed as the minimum of the allDurable timestamp, which provides a safe timestamp on primaries, and the lastApplied timestamp, which provides a safe timestamp on secondaries, since it prevents us from ever looking beyond the end of the last complete secondary batch.&lt;/p&gt;

&lt;p&gt;In addition to the changes in how we compute the stable optime, we also added a few extra conditions around when we should avoid updating the stable timestamp, mainly related to initial sync. Previously, we would not add optimes to the stable optime candidates list during initial sync, which would prevent us from setting the stable timestamp or committed snapshot behind the oldest timestamp or the initialDataTimestamp. After removal of the candidates list, though, we need to have explicit checks that we &lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L5035-L5049&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;don&apos;t try to set the stable timestamp behind the initialDataTimestamp&lt;/a&gt; after coming out of initial sync and that we &lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L5019-L5030&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;don&apos;t set it during initial sync&lt;/a&gt;, since that would run the risk that we set it behind the oldest timestamp.&lt;/p&gt;

&lt;p&gt;Note that we are stuck using optimes for now when computing the stable optime, because the value is also used to set the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/1b0445dc3ea2a3d15ae477238f68b0a4438a7212/src/mongo/db/repl/replication_coordinator_impl.h#L1578-L1581&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;_currentCommittedSnapshot&lt;/a&gt;, which is an optime. Eventually, though, it would be ideal to convert all of these values to timestamps only, since for storage engine reads (and local optime comparison), timestamps are sufficient. That will likely require a slightly larger refactor, though, since several other parts of the system are wired to use optimes e.g. the write concern &lt;a href=&quot;https://github.com/mongodb/mongo/blob/a6cd89a6c4d39b4b21376b109ad335e2fda8fb5d/src/mongo/db/repl/replication_coordinator_impl.cpp#L5563&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;notification logic&lt;/a&gt; and the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/eae31861e0f813f0099e1d490c4a622d75cd5a08/src/mongo/db/s/sharding_config_optime_gossip.cpp#L51-L54&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;configOpTime&lt;/a&gt; in sharding. I don&apos;t think these are fundamental impediments, but they probably require some more careful thought on how to re-organize things to care only about timestamps.&lt;/p&gt;</comment>
                            <comment id="3230820" author="xgen-internal-githook" created="Tue, 30 Jun 2020 18:50:59 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;William Schultz&apos;, &apos;email&apos;: &apos;william.schultz@mongodb.com&apos;, &apos;username&apos;: &apos;will62794&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47844&quot; title=&quot;Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47844&quot;&gt;&lt;del&gt;SERVER-47844&lt;/del&gt;&lt;/a&gt; Set the stable timestamp without using the stable optime candidates when enableMajorityReadConcern:true&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/02020fa91c62562cb08f30c8130baf0791cc0a67&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/02020fa91c62562cb08f30c8130baf0791cc0a67&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3230819" author="xgen-internal-githook" created="Tue, 30 Jun 2020 18:50:46 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;William Schultz&apos;, &apos;email&apos;: &apos;william.schultz@mongodb.com&apos;, &apos;username&apos;: &apos;will62794&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-47844&quot; title=&quot;Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-47844&quot;&gt;&lt;del&gt;SERVER-47844&lt;/del&gt;&lt;/a&gt; Add the ability to get the initialDataTimestamp from the storage engine interface&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/1408e1b8a5392a9001ee598b5cec66afc4e1cf77&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/1408e1b8a5392a9001ee598b5cec66afc4e1cf77&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="3204275" author="judah.schvimer" created="Thu, 11 Jun 2020 00:00:27 +0000"  >&lt;blockquote&gt;
&lt;p&gt;Avoiding stable optime updates during a non-maintenance RECOVERING state may suffice to avoid setting the stable timestamp at an inconsistent point. &lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I don&apos;t think this prevents setting the stable timestamp to an inconsistent timestamp immediately after leaving RECOVERING, i.e. after reaching minValid but before minValid is majority committed.&lt;/p&gt;</comment>
                            <comment id="3198003" author="william.schultz" created="Tue, 9 Jun 2020 19:47:38 +0000"  >&lt;p&gt;Adding some notes on behavior of the stable timestamp in different states that I encountered while testing out some changes for this ticket. This discusses details of the enableMajorityReadConcern:true case.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Initial Sync&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;During initial sync, we &lt;a href=&quot;https://github.com/mongodb/mongo/blob/eae31861e0f813f0099e1d490c4a622d75cd5a08/src/mongo/db/repl/initial_syncer.cpp#L517-L518&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;take unstable checkpoints&lt;/a&gt; so the stable timestamp isn&apos;t really functionally important. Once we complete initial sync, we &lt;a href=&quot;https://github.com/mongodb/mongo/blob/eae31861e0f813f0099e1d490c4a622d75cd5a08/src/mongo/db/repl/initial_syncer.cpp#L542-L563&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;set our initialDataTimestamp&lt;/a&gt; to our lastApplied optime after finishing oplog application. Note that, throughout initial sync, we also &lt;a href=&quot;https://github.com/mongodb/mongo/blob/9b1ed5d3acb9e38a0ba53e1edd15f8c377a07312/src/mongo/db/repl/replication_coordinator_impl.cpp#L1439-L1442&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;continuously update&lt;/a&gt; our oldest_timestamp so we don&apos;t pin too much data in memory. Thus, we must impose certain restrictions on how we set the stable timestamp during and after leaving initial sync. In the current system, we are able to avoid setting our stable timestamp earlier than the initialDataTimestamp post initial sync by &lt;a href=&quot;https://github.com/mongodb/mongo/blob/eae31861e0f813f0099e1d490c4a622d75cd5a08/src/mongo/db/repl/initial_syncer.cpp#L1565&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;not adding optimes&lt;/a&gt; to the candidates list during initial sync. Without a stable candidates list, however, we still need to ensure that we don&apos;t set our stable timestamp behind the initialDataTimestamp. Similarly, we must also not set the stable timestamp behind the oldest_timestamp. These are invariants enforced at the storage layer. 
&lt;/p&gt;

&lt;p&gt;To prevent setting stableTimestamp &amp;lt; initialDataTimestamp after initial sync, we can explicitly disallow stable timestamp updates that are earlier than our current initialDataTimestamp. To prevent setting stableTimestamp &amp;lt; oldest_timestamp during initial sync, we can prevent stable timestamp updates entirely during initial sync, i.e. while in STARTUP2 state. Note that we might be updating our lastCommittedOpTime during initial sync (since other nodes might be committing writes), so if we didn&apos;t explicitly disallow these updates, we might incorrectly go on updating our stable timestamp during or after initial sync. To summarize this case, these are the two important invariants we need to ensure are satisfied during initial sync and after leaving it:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;tt&gt;stable_timestamp &amp;gt;= oldest_timestamp&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;stable_timestamp &amp;gt;= initialDataTimestamp&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;b&gt;Secondary&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;One of the original motivations for the stable optime candidates list was to ensure that we didn&apos;t set the stable timestamp to a timestamp that falls in the middle of a secondary oplog application batch. We enforce this by only adding optimes at secondary batch boundaries to the candidate set. It is now safe to set the stable timestamp to a timestamp in the middle of a batch, but we still need to be careful to not set the stable timestamp in the middle of a batch &lt;em&gt;while&lt;/em&gt; it is being applied. For example, consider a secondary that is applying a batch containing ops with times [1,2,3], and whose lastCommittedOpTime has already advanced to time 3. Since we apply oplog entries in parallel, we might apply the op at time 3, advance our stable timestamp to 3 (since it is &amp;lt;= our commit point), then try to write the oplog entry at time 2, which would violate the invariant that we do not commit a storage transaction at a timestamp behind the stable timestamp. To avoid this, we can constrain the stable timestamp to not surpass our current lastApplied optime, which will be set at the previous fully completed batch boundary. This prevents us from advancing the stable timestamp ahead of secondary batch writes that are at an earlier timestamp and have not yet completed. We can roughly summarize this case with one of the important invariants we need to uphold during secondary batch application:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;tt&gt;commit_timestamp &amp;gt;= stable_timestamp&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;b&gt;Primary&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;The behavior of the stable timestamp on a primary is fairly straightforward: in most cases we can just set it to the lastCommittedOpTime directly. In replica sets with more than one voting node, ops cannot be replicated before they are behind the allDurable timestamp, so the lastCommittedOpTime is always constrained to be behind allDurable. In single voting node replica sets, however, the lastCommittedOpTime will be updated to whatever our lastApplied is, so it may be ahead of the allDurable timestamp. Since we do not want to set our stable timestamp to an &quot;inconsistent&quot; timestamp &amp;gt; allDurable, we need to constrain the stable timestamp to be no greater than allDurable on primaries, i.e. set it to the minimum of lastCommitted and allDurable. We can roughly summarize this case with these invariants (the second of which must hold true in any state):&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;tt&gt;stable_timestamp &amp;lt;= all_durable&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;stable_timestamp &amp;lt;= lastCommitted&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;b&gt;Recovering&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;When we have not yet reached minValid, we do &lt;a href=&quot;https://github.com/mongodb/mongo/blob/bd74e903563e9c12bbaf3a027a8dc39ae6a1613d/src/mongo/db/repl/oplog_applier_impl.cpp#L519-L528&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;not add optimes to the stable candidates&lt;/a&gt; list, because they may be at &quot;inconsistent&quot; points. We want to address this issue when we remove the candidates list. Avoiding stable optime updates during a non-maintenance RECOVERING state may suffice to avoid setting the stable timestamp at an inconsistent point. However, it may be possible that we are in RECOVERING, have not reached minValid, and are also in maintenance mode, so we may need a way to detect explicitly if we have reached a consistent optime.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Startup Recovery&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;There was an &lt;a href=&quot;https://github.com/mongodb/mongo/blob/c277251fb7bce31be749a236a07bdc7b3412b173/src/mongo/db/repl/replication_recovery.cpp#L429-L433&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;invariant in place&lt;/a&gt; that &lt;a href=&quot;https://mongodbcr.appspot.com/184800001/diff/80001/src/mongo/db/repl/replication_recovery.cpp&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;dates back to 4.0&lt;/a&gt;, which verified that, if we have an appliedThrough value at startup, then it should be equal to the stable timestamp, since the &lt;a href=&quot;https://github.com/mongodb/mongo/blob/bd74e903563e9c12bbaf3a027a8dc39ae6a1613d/src/mongo/db/repl/oplog_applier_impl.cpp#L503-L504&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;appliedThrough is set&lt;/a&gt; to the last optime of a batch after it is applied, and we only set the stable timestamp on secondaries at batch boundaries. With removal of the stable optime candidates list, this will no longer be true, so it should be reasonable to remove this invariant. We can roughly summarize this case with the following invariant, which no longer must hold true:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;tt&gt;stable_timestamp &#8712; batch_boundaries&lt;/tt&gt; (no longer true)&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="3062627" author="william.schultz" created="Thu, 30 Apr 2020 15:15:47 +0000"  >&lt;p&gt;To make commits and reviews smaller, we can likely do this separately for EMRC=true and EMRC=false. Since we won&apos;t be removing the supporting logic for updating the stable optime candidates yet, it should be fine to temporarily have EMRC=true not use the stable optime candidates set while EMRC=false is still using it.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                            <outwardlinks description="depends on">
                                        <issuelink>
            <issuekey id="1385932">SERVER-49006</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="1368137">SERVER-48518</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1333350">SERVER-47845</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10520">
                    <name>Problem/Incident</name>
                                            <outwardlinks description="causes">
                                        <issuelink>
            <issuekey id="1407930">SERVER-49472</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1652773">SERVER-55305</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1397971">SERVER-49221</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1403661">SERVER-49355</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1405247">SERVER-49406</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="509425">SERVER-33806</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="414627">SERVER-30577</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="497237">SERVER-33292</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="1394678">SERVER-49167</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 11 Jun 2020 00:00:27 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        3 years, 18 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[<s><a href='https://jira.mongodb.org/browse/SERVER-49006'>SERVER-49006</a></s>]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-1713</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            3 years, 18 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>15.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>jesse@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>judah.schvimer@mongodb.com</customfieldvalue>
            <customfieldvalue>william.schultz@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hxiez3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hx5stz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="3882">Repl 2020-05-18</customfieldvalue>
    <customfieldvalue id="3934">Repl 2020-06-01</customfieldvalue>
    <customfieldvalue id="3935">Repl 2020-06-15</customfieldvalue>
    <customfieldvalue id="3999">Repl 2020-06-29</customfieldvalue>
    <customfieldvalue id="4033">Repl 2020-07-13</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hxi18f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>