<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:39:28 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-35316] Causal Consistency Violation for local read/write concerns in Jepsen Sharded Cluster Test</title>
                <link>https://jira.mongodb.org/browse/SERVER-35316</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Original Problem:&lt;/p&gt;

&lt;p&gt;When running a test against 2 replica set shards, with rc &#8220;local&#8221; and wc &#8220;w1&#8221;, we get reads returning the base value of the document, nil, despite occurring after acknowledged writes in the session. Each single-threaded client writes to one key at a time, using one session, against a single mongos router, and does not have writes to secondaries enabled. The nemesis partitions the network into random halves for 10 seconds, with a 10 second wait in between. (This failure has not appeared with partitions disabled.)&lt;/p&gt;

&lt;p&gt;The expected pattern of operations in this test is read nil/0, write 1, read 1, write 2, read 2. In the test histories I&#8217;ve attached below, the :position field is the op&#8217;s optime value, read from the session after acknowledgement. Its :link field is the previous optime value the client has seen in that key&#8217;s session.&#160;&lt;/p&gt;

&lt;p&gt;In the first set of results (rwr-initial-read-1), this occurs in 3 keys over a 40 second test. In each failing key (15 23 16), we see a read of 0 (representing the initial empty document&#8217;s read nil for the checker), and a successful write of 1. Then the read following write 2 returns an empty&#160;document.&#160;&#160;&lt;/p&gt;

&lt;p&gt;The second result set&#160;(rwr-initial-read-2) provided is a longer test over 300 seconds, where we observe this anomaly 6 times. Keys 50, 51, 82, 125, and 143 appear to drop the value on write 2. However, key 116 is missing the value for write 1.&#160;&#160;Also of note is that in the history for key 116, (under independent/116) write 2 succeeds and appears in the final read for the key. See op `{:type :ok, :f :read, :value 2, :process 5, :time 205602064887, :position 6559250071353819138, :link 6559250071353819137, :index 1206}`&lt;/p&gt;
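
&lt;p&gt;A minimal sketch of the check described above, assuming one single-threaded client and one session per key. The function name find_stale_reads and the dictionary layout of each op are hypothetical simplifications for illustration; they are not the Jepsen history format or the checker used in this test.&lt;/p&gt;

```python
# Hypothetical, simplified history checker. The op dictionaries below are
# illustrative and are not the Jepsen history format.

def find_stale_reads(history):
    """Return acknowledged reads that do not reflect the latest acknowledged write."""
    last_written = 0  # 0 stands for the initial empty document (nil)
    stale = []
    for op in history:
        if op["type"] != "ok":
            continue  # only acknowledged operations are considered
        if op["f"] == "write":
            last_written = op["value"]
        elif op["f"] == "read" and op["value"] != last_written:
            stale.append(op)
    return stale


# Expected pattern: read nil/0, write 1, read 1, write 2, read 2.
expected = [
    {"type": "ok", "f": "read", "value": 0},
    {"type": "ok", "f": "write", "value": 1},
    {"type": "ok", "f": "read", "value": 1},
    {"type": "ok", "f": "write", "value": 2},
    {"type": "ok", "f": "read", "value": 2},
]
# Shape of the anomaly: the read following write 2 returns the empty document.
anomalous = expected[:4] + [{"type": "ok", "f": "read", "value": 0}]

print(find_stale_reads(expected))   # no reads are flagged
print(find_stale_reads(anomalous))  # the final read is flagged
```

&lt;p&gt;Under these assumptions, any flagged read is a read that occurred after an acknowledged write in the same session but did not observe it.&lt;/p&gt;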

</description>
                <environment></environment>
        <key id="552253">SERVER-35316</key>
            <summary>Causal Consistency Violation for local read/write concerns in Jepsen Sharded Cluster Test</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13202">Works as Designed</resolution>
                                        <assignee username="alyson.cabral@mongodb.com">Alyson Cabral</assignee>
                                    <reporter username="cristopher.stauffer@mongodb.com">Cristopher Stauffer</reporter>
                        <labels>
                    </labels>
                <created>Thu, 31 May 2018 16:37:39 +0000</created>
                <updated>Fri, 27 Oct 2023 13:53:45 +0000</updated>
                            <resolved>Thu, 20 Sep 2018 02:31:06 +0000</resolved>
                                                                    <component>Sharding</component>
                                        <votes>0</votes>
                                    <watches>15</watches>
                                                                                                                <comments>
                            <comment id="2008354" author="alyson.cabral" created="Wed, 19 Sep 2018 19:54:04 +0000"  >&lt;p&gt;We&apos;ve created &lt;a href=&quot;https://jira.mongodb.org/browse/DOCS-11866&quot; title=&quot;Document Causal Consistency behavior for different read and write concerns&quot; class=&quot;issue-link&quot; data-issue-key=&quot;DOCS-11866&quot;&gt;&lt;del&gt;DOCS-11866&lt;/del&gt;&lt;/a&gt; and updated &lt;a href=&quot;https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/#causal-consistency&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;the documentation&lt;/a&gt; to recommend read and write concern configurations&#160;that are always safe, even during network partitions.&#160;When used with causal consistency, the combination of read concern majority and write concern majority always safely delivers causal guarantees.   &lt;/p&gt;


&lt;p&gt;This ticket specifically points out an anomaly with using values of write concern less than &apos;majority&apos;.&lt;/p&gt;

&lt;p&gt;Write concern, or write acknowledgment, tells the server how long the client is willing to wait to know the definitive result of that write (i.e. whether the write committed or not). Write concern options are:&lt;/p&gt;

&lt;p&gt;1 &#8211; the write returns a success once it has been applied to the primary&lt;br/&gt;
N &#8211; the write returns a success once it has been applied to N nodes&lt;br/&gt;
Majority &#8211; the write returns a success once it has been applied to a majority of nodes&lt;/p&gt;

&lt;p&gt;Only a write acknowledged with write concern majority is committed and guaranteed to be durable across system failures.&lt;/p&gt;
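
&lt;p&gt;As a rough illustration of the acknowledgment rules above, here is a toy Python model. The names majority and is_acknowledged are hypothetical and are not part of any MongoDB server or driver API; the model only counts how many nodes have applied a write.&lt;/p&gt;

```python
# Toy model of write acknowledgment; not MongoDB server or driver code.

def majority(total_nodes):
    """Smallest number of nodes that forms a majority of the replica set."""
    return total_nodes // 2 + 1


def is_acknowledged(write_concern, applied_nodes, total_nodes):
    """True once the write has been applied to enough nodes for the concern.

    applied_nodes counts the primary plus every secondary that applied the write.
    """
    if write_concern == "majority":
        needed = majority(total_nodes)
    else:
        needed = int(write_concern)
    return applied_nodes >= needed


# A 3-node replica set where only the primary has applied the write.
print(is_acknowledged(1, 1, 3))           # True
print(is_acknowledged(2, 1, 3))           # False
print(is_acknowledged("majority", 1, 3))  # False
# One secondary has now applied the write as well.
print(is_acknowledged("majority", 2, 3))  # True
```

&lt;p&gt;In this model a write concern of 1 is satisfied by the primary alone, which is why such a write can be acknowledged and still be lost later.&lt;/p&gt;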

&lt;p&gt;Let&#8217;s consider the behavior of write concern (WC) during a network partition, when write concern is 1 (WC:1).&lt;/p&gt;

&lt;p&gt;In this example, the causal sequence of operations is as follows:&lt;br/&gt;
At Time T1 perform a write W1 &lt;br/&gt;
At Time T2 perform a read R1 &lt;br/&gt;
 &lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;https://jira.mongodb.org/secure/attachment/196683/196683_diagram1.png&quot; width=&quot;100%&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt; &lt;br/&gt;
In this diagram, P1 has been partitioned from a majority of nodes and P2 has been elected as the new primary. However, P1 does not yet know it isn&#8217;t primary and may continue to accept writes. Once P1 is reconnected to a majority of nodes, all of its writes since the timeline diverged will be rolled back.&lt;/p&gt;

&lt;p&gt;If the&#160;write W1 is using write concern 1, and the&#160;read R1 is using read concern majority, the diagram below shows the timelines on which each operation can successfully&#160;execute.&#160;&lt;br/&gt;
&lt;span class=&quot;image-wrap&quot; style=&quot;&quot;&gt;&lt;img src=&quot;https://jira.mongodb.org/secure/attachment/196686/196686_diagram2.png&quot; width=&quot;100%&quot; style=&quot;border: 0px solid black&quot; /&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;The write W1 with write concern 1 may return successfully when applied to the timeline of either P1 or P2. If the write was applied on P1, the client gets a success message even though the write may ultimately be rolled back. &lt;/p&gt;

&lt;p&gt;The causal read R1 with read concern majority will wait until the time T1 (or later) becomes majority committed before returning success. P1 is unable to advance its notion of the majority commit point since it is partitioned from a majority of nodes, so any successful read R1 must have executed on the true primary&#8217;s timeline and will see the definitive result of the write. In this case, the definitive result of the write may be that the write did not commit. If R1 sees the result that the write W1 did not commit, this means that the write will never be committed.&lt;/p&gt;

&lt;p&gt;Even with write concerns less than majority, the causal ordering of the committed writes is maintained. However, durability is not guaranteed.&lt;/p&gt;
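
&lt;p&gt;The scenario above can be sketched as a toy Python model. The Node class and its methods are hypothetical and exist only for this illustration. The model assumes that a node reaching a majority treats its applied writes as majority committed, and it ignores elections, terms, and oplog details.&lt;/p&gt;

```python
# Toy model of the partition scenario; names P1 and P2 follow the diagram.
# Assumption: on a node that can reach a majority, applied writes are treated
# as majority committed. Real replication, elections, and terms are ignored.

class Node:
    def __init__(self, name, reachable, total):
        self.name = name
        self.reachable = reachable  # nodes this node can reach, itself included
        self.total = total          # replica set size
        self.applied = []           # writes applied on this node's timeline

    def sees_majority(self):
        return self.reachable >= self.total // 2 + 1

    def write_w1(self, value):
        """w:1 is acknowledged as soon as this node applies the write."""
        self.applied.append(value)
        return True

    def read_majority(self):
        """Return the latest committed value, or 'blocked' without a majority."""
        if not self.sees_majority():
            return "blocked"  # the commit point cannot advance on this node
        return self.applied[-1] if self.applied else None


p1 = Node("P1", reachable=1, total=3)  # stale primary, partitioned away
p2 = Node("P2", reachable=2, total=3)  # newly elected primary

acked = p1.write_w1("W1")      # acknowledged on P1's timeline
r1_on_p1 = p1.read_majority()  # cannot succeed on P1
r1_on_p2 = p2.read_majority()  # succeeds, but W1 is not on this timeline

p1.applied = list(p2.applied)  # partition heals: P1 rolls back to P2's timeline

print(acked, r1_on_p1, r1_on_p2, p1.applied)
```

&lt;p&gt;In this sketch W1 is acknowledged to the client, the majority read can only succeed on P2, and W1 no longer exists anywhere once P1 rolls back.&lt;/p&gt;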

&lt;p&gt;To learn more about the effects of different combinations of read and write concern on causal guarantees, we&apos;ve created a &lt;a href=&quot;https://docs.mongodb.com/manual/core/causal-consistency-read-write-concerns/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;detailed explanation of behavior&lt;/a&gt; in our documentation. No anomalies exist when using the combination of read concern majority and write concern majority.&lt;/p&gt;</comment>
                            <comment id="1912933" author="mkcp" created="Wed, 6 Jun 2018 20:20:55 +0000"  >&lt;p&gt;I&apos;ve run the causal register test on 4.0-rc1 with various write and read levels, and there&apos;s no apparent change in behavior between 3.6 and 4.0. That is, operations with sub-majority writes may not be observed in future dependent reads.&lt;/p&gt;

&lt;p&gt;In tests with w: majority, I have not found any anomalous histories. If such anomalies are possible, then the current causal register test may not be strong enough to find them, and this may be worth testing in future work.&lt;/p&gt;

&lt;p&gt;As they stand now, sub-majority writes can violate the base case of causality in the presence of network partitions. It seems insufficient to document in&#160;&lt;a href=&quot;https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/&lt;/a&gt; that any acknowledged write in a CC-enabled session will be causally consistent. Rather, a write must be acknowledged by a majority of nodes for it to be safe. It looks like now, we have dependent writes failing to appear in time for the read, and therefore we&apos;re serving up stale data, despite depending on a successful write in that session. If a read&apos;s dependent writes are not available, you have to block until they are. Though, you have the option to cache written values in the client so they can be served up immediately for dependent reads.&lt;/p&gt;</comment>
                            <comment id="1907024" author="misha.tyulenev" created="Thu, 31 May 2018 17:33:34 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=cristopher.stauffer&quot; class=&quot;user-hover&quot; rel=&quot;cristopher.stauffer&quot;&gt;cristopher.stauffer&lt;/a&gt;  yes for both questions. The results look consistent with the design. &lt;br/&gt;
Here are some ideas on how it can be investigated further:&lt;br/&gt;
Using w:1 or level:local will result in different anomalies, as some data can disappear.&lt;br/&gt;
It would be interesting to learn which guarantees are broken from the client&apos;s perspective:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;monotonic reads&lt;/li&gt;
	&lt;li&gt;monotonic writes&lt;/li&gt;
	&lt;li&gt;read your writes&lt;/li&gt;
	&lt;li&gt;writes follow reads&lt;br/&gt;
(&lt;a href=&quot;https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;w:majority and level:majority will give all four properties.&lt;br/&gt;
Relaxing the read or write concern introduces a compromise for the user; e.g., w:majority with level:local will likely preserve monotonic writes and read your writes, but will break the other two properties.&lt;/p&gt;</comment>
                            <comment id="1906932" author="cristopher.stauffer" created="Thu, 31 May 2018 16:41:00 +0000"  >&lt;p&gt;On behalf of user:&#160;&lt;br/&gt;
&#160;&lt;br/&gt;
I&apos;ve tested with w: w1, and level: majority, and while the behavior is distinct, we still appear to be missing writes. Rather than reading out an empty document, we observe the latest acknowledged write missing (See results.edn in the attached test case).&lt;br/&gt;
&#160;&lt;br/&gt;
The behavior has indeed disappeared when running with w: majority, r: local. I&apos;ll keep digging here.&lt;br/&gt;
&#160;&lt;br/&gt;
Regarding opTime and afterClusterTime in the session, I am depending on the java driver&apos;s behavior. I start a session at the beginning of each key, pass the session to each invocation for that key, and write the session&apos;s optime value to the test&apos;s history after each successful op.&lt;/p&gt;</comment>
                            <comment id="1906928" author="cristopher.stauffer" created="Thu, 31 May 2018 16:38:38 +0000"  >&lt;p&gt;Two Questions:&lt;/p&gt;

&lt;p&gt;The&#160;&lt;em&gt;local&lt;/em&gt;&#160;read can return data that was rolled back on partition, as it was a&#160;&lt;em&gt;w1&lt;/em&gt;&#160;write. As such, there should be missing values that still satisfy the&#160;&lt;em&gt;afterClusterTime&lt;/em&gt;. Can you confirm this is the case? To make this behavior disappear, the data has to be&#160;&lt;em&gt;w:majority&lt;/em&gt;&#160;or&#160;&lt;em&gt;level: majority&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Can you send the test design? Specifically to be sure how the&lt;/em&gt;&#160;&lt;em&gt;operationTime&lt;/em&gt;&#160;&lt;em&gt;is connected to the&lt;/em&gt;&#160;&lt;em&gt;afterClusterTime&lt;/em&gt;&#160;&lt;em&gt;in a session.&lt;/em&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                        <issuelink>
            <issuekey id="569264">DOCS-11866</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="196683" name="diagram1.png" size="51433" author="alyson.cabral@mongodb.com" created="Wed, 19 Sep 2018 19:35:49 +0000"/>
                            <attachment id="196686" name="diagram2.png" size="55753" author="alyson.cabral@mongodb.com" created="Wed, 19 Sep 2018 19:44:29 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 31 May 2018 17:33:34 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        5 years, 21 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            5 years, 21 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>alyson.cabral@mongodb.com</customfieldvalue>
            <customfieldvalue>cristopher.stauffer@mongodb.com</customfieldvalue>
            <customfieldvalue>mkcp</customfieldvalue>
            <customfieldvalue>misha.tyulenev@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|htzg8f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr8zu7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|htz2hr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>