<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:47:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-81782] Waiting for journalling is redundant with waiting for replication for replicated writes</title>
                <link>https://jira.mongodb.org/browse/SERVER-81782</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;See &lt;a href=&quot;https://docs.google.com/document/d/1TaN4VhYHU6Abpc2ccvAn65Sa9Im8ZtDymX7Ol6PwiJs/edit&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this doc&lt;/a&gt; for a more detailed description of how waiting for journaling works today.&lt;/p&gt;

&lt;p&gt;Right now, for (implicit and explicit) &lt;tt&gt;j:1&lt;/tt&gt; writes we wait for durability on the current node (which also includes waiting for no oplog holes if voters=1) prior to waiting for replication. However, the mechanism used for waiting for replication is redundant with the mechanism for waiting for durability now that the replication and topology coordinators track the durable point for all nodes, including themselves. We should therefore skip waiting for durability locally and just go straight to waiting for replication.&lt;/p&gt;

&lt;p&gt;There are a few things to be careful with when doing this:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;We currently &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b54ebb4b88dd89b6155122ade53f5d8b6ecc17f3/src/mongo/db/write_concern.cpp#L330-L334&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;skip&lt;/a&gt; waiting for replication for &lt;tt&gt;w:1&lt;/tt&gt; writes because we aren&apos;t waiting for other nodes.&lt;/li&gt;
	&lt;li&gt;We need to decide what semantics we want for &lt;tt&gt;j:1&lt;/tt&gt; writes. Currently, &lt;tt&gt;j:1&lt;/tt&gt; requires that the write is durable on the current node. While we &lt;em&gt;can&lt;/em&gt; preserve that behavior by changing what we consider ready to confirm in the awaitReplication checks, I&apos;m not sure what we want here. At least for &lt;tt&gt;w:majority, j:1&lt;/tt&gt; writes, it seems reasonable to return once it is durable on &lt;em&gt;any&lt;/em&gt; set of nodes that constitutes a majority, even if it doesn&apos;t include the current primary. For &lt;tt&gt;w:1, j:1&lt;/tt&gt; I could imagine someone expecting it to be durable on the current node, but I don&apos;t think that is really meaningful in our model.&lt;/li&gt;
	&lt;li&gt;We need to be careful if the command does any non-replicated writes either after replicated writes or if there are no replicated writes. The &lt;tt&gt;awaitReplication&lt;/tt&gt; logic can only wait for a specific optime, but non-replicated writes may happen after the latest optime. We need to avoid reintroducing a variant of &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-81780&quot; title=&quot;WaitForWriteConcern should not be skipped entirely for local commands&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-81780&quot;&gt;SERVER-81780&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;We need to ensure that something tells the JournalFlusher thread that it should run. That is currently handled by the logic to wait for local journaling, but if we skip that, it won&apos;t happen any more. I prototyped kicking that thread from every &lt;tt&gt;WUOW::commit()&lt;/tt&gt; right next to where we currently &lt;a href=&quot;https://github.com/mongodb/mongo/blob/b54ebb4b88dd89b6155122ade53f5d8b6ecc17f3/src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp#L378-L389&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;kick the OplogVisibilityThread&lt;/a&gt;. That should work fine as long as we carefully use atomics so that only a single thread needs to acquire the JournalFlusher&apos;s mutex each time it loops. Otherwise it risks becoming an additional contention point. I&apos;m not sure if we want to do this on secondaries or just the primary.&lt;/li&gt;
&lt;/ul&gt;
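
&lt;p&gt;To make the semantics question in the second point concrete, here is a minimal sketch. It assumes a simplified model in which every node is a voter and reports a single integer durable optime. The names durable_optimes, is_majority_durable and is_locally_durable are hypothetical and are not the server&apos;s API.&lt;/p&gt;

```python
def is_majority_durable(durable_optimes, write_optime):
    """Return True once a majority of nodes have journaled the write.

    durable_optimes: dict mapping a (hypothetical) node name to the highest
    optime that node has reported as durable. Optimes are modeled as ints,
    and every node is assumed to be a voter.
    """
    majority = len(durable_optimes) // 2 + 1
    durable_nodes = [
        node for node, optime in durable_optimes.items()
        if optime >= write_optime
    ]
    return len(durable_nodes) >= majority


def is_locally_durable(durable_optimes, primary, write_optime):
    """The current j:1 requirement in this model: durable on the primary."""
    return durable_optimes[primary] >= write_optime


# The primary has not journaled optime 7 yet, but both secondaries have.
state = {"primary": 6, "secondary1": 7, "secondary2": 7}
majority_ok = is_majority_durable(state, 7)
local_ok = is_locally_durable(state, "primary", 7)
print(majority_ok, local_ok)  # prints: True False
```

&lt;p&gt;In this model a &lt;tt&gt;w:majority, j:1&lt;/tt&gt; write at optime 7 could be acknowledged under the &quot;any majority&quot; rule, while the current rule would keep waiting for the primary&apos;s own journal flush.&lt;/p&gt;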


&lt;p&gt;Note that for most user-initiated writes (which tend to be replicated), the &lt;tt&gt;awaitReplication&lt;/tt&gt; logic does a better job of waiting for journaling than the logic that waits for durability:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The durability wait first waits for there to be no oplog holes. However, it does this by checking the cache maintained by the OplogVisibilityThread rather than asking WT directly, and because waiters don&apos;t tap the cv (either intentionally or as an oversight), that thread will wait up to 1ms or until the next &lt;tt&gt;WUOW::commit()&lt;/tt&gt;, which is a real problem for single-threaded writers.
	&lt;ul&gt;
		&lt;li&gt;This is only done when there is a single voter in the replica set. Because of the oplogTruncatePoint, we &lt;b&gt;really&lt;/b&gt; should be doing this for all replica sets. However, because we also check this in &lt;tt&gt;awaitReplication&lt;/tt&gt;, we are ok.&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;Both the no holes wait and the wait for journaling wait for the next pass of their respective threads after &quot;now&quot; rather than waiting for the optime of the operation. This is obviously wrong if the write happens to be durable already by the time we reach the wait, but it is also problematic because it will wait for any other ops that happen to arrive in parallel before we reach that point. And if there is a single voter, the &quot;now&quot; point used for waiting for journaling is &lt;b&gt;after&lt;/b&gt; we have waited for no holes, so is likely to include even more unrelated writes.&lt;/li&gt;
	&lt;li&gt;When we need to wait for replication, it is likely (but not guaranteed) that the replication will complete after the local journaling. This means that our thread may wait and wake up 2 or 3 times in the process of waiting for write concern. If we instead centralize all waiting in &lt;tt&gt;awaitReplication&lt;/tt&gt; (which perhaps should be renamed), we will only need to wait/wake once, which reduces needless context switches.&lt;/li&gt;
&lt;/ul&gt;
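
&lt;p&gt;The difference between waiting for the next pass after &quot;now&quot; and waiting for a specific optime can be illustrated with a toy, single-threaded model. This is only a sketch under stated assumptions: FlusherModel and both helper functions are hypothetical, optimes are plain integers, and a flush pass is assumed to make everything written so far durable.&lt;/p&gt;

```python
class FlusherModel:
    """Toy, single-threaded model of a journal flusher (hypothetical names)."""

    def __init__(self):
        self.last_written = 0  # highest optime written so far
        self.durable = 0       # highest optime known to be journaled
        self.passes = 0        # number of completed flush passes

    def write(self):
        self.last_written += 1
        return self.last_written

    def flush_pass(self):
        # One pass makes everything written so far durable.
        self.durable = self.last_written
        self.passes += 1


def passes_waiting_for_next_pass(model):
    """Wait for the first pass that completes after 'now'."""
    start = model.passes
    waited = 0
    while model.passes == start:
        model.flush_pass()
        waited += 1
    return waited


def passes_waiting_for_optime(model, optime):
    """Wait only until the given optime is durable."""
    waited = 0
    while optime > model.durable:
        model.flush_pass()
        waited += 1
    return waited


# In both runs the write is already durable before the waiter starts.
a = FlusherModel()
a.write()
a.flush_pass()
next_pass_cost = passes_waiting_for_next_pass(a)

b = FlusherModel()
op = b.write()
b.flush_pass()
optime_cost = passes_waiting_for_optime(b, op)

print(next_pass_cost, optime_cost)  # prints: 1 0
```

&lt;p&gt;The waiter keyed on &quot;now&quot; pays for one extra flush pass even though its write was already durable, while the waiter keyed on the optime returns immediately.&lt;/p&gt;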
</description>
                <environment></environment>
        <key id="2457093">SERVER-81782</key>
            <summary>Waiting for journalling is redundant with waiting for replication for replicated writes</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="wenbin.zhu@mongodb.com">Wenbin Zhu</assignee>
                                    <reporter username="mathias@mongodb.com">Mathias Stearn</reporter>
                        <labels>
                            <label>perf-8.0</label>
                            <label>perf-tiger</label>
                            <label>perf-tiger-handoff</label>
                            <label>perf-tiger-q4</label>
                            <label>repl-shortlist</label>
                    </labels>
                <created>Tue, 3 Oct 2023 10:10:20 +0000</created>
                <updated>Thu, 1 Feb 2024 19:32:07 +0000</updated>
                            <resolved>Tue, 30 Jan 2024 22:41:13 +0000</resolved>
                                                    <fixVersion>8.0.0-rc0</fixVersion>
                                                        <votes>0</votes>
                                    <watches>14</watches>
                                                                                                                <comments>
                            <comment id="6054201" author="xgen-internal-githook" created="Tue, 30 Jan 2024 22:17:47 +0000"  >&lt;p&gt;Author: &lt;/p&gt;
{&apos;name&apos;: &apos;Wenbin Zhu&apos;, &apos;email&apos;: &apos;wenbin.zhu@mongodb.com&apos;, &apos;username&apos;: &apos;WenbinZhu&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-81782&quot; title=&quot;Waiting for journalling is redundant with waiting for replication for replicated writes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-81782&quot;&gt;&lt;del&gt;SERVER-81782&lt;/del&gt;&lt;/a&gt; Do not wait for journal flush completion for replicated writes. (#18401)&lt;/p&gt;

&lt;p&gt;GitOrigin-RevId: b4687beeb602b45dd9cda2845e837eb14536c406&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/b5b058c61209de1d167888e3c39ae445abcdab6c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/b5b058c61209de1d167888e3c39ae445abcdab6c&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="2457076">SERVER-81780</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10520">
                    <name>Problem/Incident</name>
                                            <outwardlinks description="causes">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25128"><![CDATA[Replication]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 9 Oct 2023 18:20:22 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        1 week, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>sviatlana.zuiko@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            1 week, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>135.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>xgen-internal-githook</customfieldvalue>
            <customfieldvalue>mathias@mongodb.com</customfieldvalue>
            <customfieldvalue>wenbin.zhu@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i2t3ef:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i2b0mc:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_22250" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Special Downgrade Instructions Required</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="23343"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="7968">Repl 2023-12-25</customfieldvalue>
    <customfieldvalue id="7969">Repl 2024-01-08</customfieldvalue>
    <customfieldvalue id="7971">Repl 2024-02-05</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i2spjr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>