<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:41:34 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-35952] Secondaries can fall off the oplog even if necessary oplog entries exist in cluster</title>
                <link>https://jira.mongodb.org/browse/SERVER-35952</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;When evaluating sync sources, we only consider current state and lag, not which oplog entries each candidate has. This can lead to situations where a chained secondary falls off the oplog when switching sync sources (due to sync source re-evaluation) even though the necessary oplog entries exist in the replica set as a whole.&lt;/p&gt;</description>
                <environment></environment>
        <key id="567428">SERVER-35952</key>
            <summary>Secondaries can fall off the oplog even if necessary oplog entries exist in cluster</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13202">Works as Designed</resolution>
                                        <assignee username="backlog-server-repl">Backlog - Replication Team</assignee>
                                    <reporter username="james.kovacs@mongodb.com">James Kovacs</reporter>
                        <labels>
                    </labels>
                <created>Tue, 3 Jul 2018 17:51:36 +0000</created>
                <updated>Fri, 27 Oct 2023 13:53:40 +0000</updated>
                            <resolved>Mon, 29 Jul 2019 17:11:57 +0000</resolved>
                                                                    <component>Replication</component>
                                        <votes>0</votes>
                                    <watches>9</watches>
                                                                                                                <comments>
                            <comment id="1938393" author="james.kovacs" created="Tue, 3 Jul 2018 21:10:26 +0000"  >&lt;p&gt;It wasn&apos;t clear from the log entries that the node was actually using the lagging secondary as a sync source between re-evaluation attempts; it appeared to be looping through sync sources repeatedly until it fell off the oplog. Cross-referencing the logs from the other members showed what actually happened: they were all looping through candidates trying to find a better sync source, selecting the lagging sync source (because they were too stale to sync from the primary directly), and replicating from it until the next re-evaluation attempt. When the lagging sync source fell off the oplog, all secondaries in the remote DC fell off together. I don&apos;t think there is anything better we could do in such a situation. We can close this as &quot;By Design&quot;.&lt;/p&gt;</comment>
                            <comment id="1938142" author="milkie" created="Tue, 3 Jul 2018 18:31:28 +0000"  >&lt;p&gt;That part that Judah wrote is true, but we actually make two passes through the candidate list: one with the restrictions and, if no candidate comes out of the first pass, a second one without the restrictions. See the comment at &lt;a href=&quot;https://github.com/mongodb/mongo/blob/57d7938c49da06122d4d43054ff89e1881d0209f/src/mongo/db/repl/topology_coordinator.cpp#L317&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/blob/57d7938c49da06122d4d43054ff89e1881d0209f/src/mongo/db/repl/topology_coordinator.cpp#L317&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="1938124" author="james.kovacs" created="Tue, 3 Jul 2018 18:18:11 +0000"  >&lt;p&gt;From &lt;a href=&quot;https://github.com/mongodb/mongo/wiki/Replication-Internals#sync-source-selection&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Sync Source Selection&lt;/a&gt; in &lt;a href=&quot;https://github.com/mongodb/mongo/wiki/Replication-Internals&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;Replication Internals&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;First the secondary checks the TopologyCoordinator&apos;s cached view of the replica set for the latest OpTime known to be on the primary. Secondaries do not sync from nodes whose newest oplog entry is more than maxSyncSourceLagSecs seconds behind the primary&apos;s newest oplog entry.&lt;/p&gt;&lt;/blockquote&gt;</comment>
                            <comment id="1938082" author="milkie" created="Tue, 3 Jul 2018 17:58:09 +0000"  >&lt;p&gt;I&apos;m confused about step 6 in your repro steps. I didn&apos;t think we had logic that did that? Certainly, if the node thinks it is lagged by more than 30 seconds, it will rescan sync source candidates, but I didn&apos;t think we blacklisted any candidates due to their perceived lag.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25128"><![CDATA[Replication]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[500A000000ar3WgIAI]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 3 Jul 2018 17:58:09 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        5 years, 32 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            5 years, 32 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-repl</customfieldvalue>
            <customfieldvalue>milkie@mongodb.com</customfieldvalue>
            <customfieldvalue>james.kovacs@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hu1xzj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|htsovb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;p&gt;Consider a replica set with the following chaining:&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;P -&amp;gt; S1 -&amp;gt; S2&lt;/tt&gt;&lt;/p&gt;

&lt;p&gt;Consider four points in time:&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;t1 -&amp;gt; t2 -&amp;gt; t3 -&amp;gt; t4&lt;/tt&gt;&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Current time is &lt;tt&gt;t4&lt;/tt&gt;.&lt;/li&gt;
	&lt;li&gt;Primary &lt;tt&gt;P&lt;/tt&gt;&#160;has oplog entries back to time &lt;tt&gt;t3&lt;/tt&gt;.&lt;/li&gt;
	&lt;li&gt;Secondary &lt;tt&gt;S1&lt;/tt&gt;&#160;has oplog entries further back to &lt;tt&gt;t1&lt;/tt&gt; due to replication lag.&lt;/li&gt;
	&lt;li&gt;Secondary &lt;tt&gt;S2&lt;/tt&gt; requires oplog entries back to &lt;tt&gt;t2&lt;/tt&gt;.&lt;/li&gt;
	&lt;li&gt;Secondary &lt;tt&gt;S2&lt;/tt&gt;&#160;realizes its sync source &lt;tt&gt;S1&lt;/tt&gt;&#160;is lagged by more than 30 seconds and re-evaluates sync sources.&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;S2&lt;/tt&gt; disqualifies &lt;tt&gt;S1&lt;/tt&gt; from consideration because it is more than 30 seconds behind.&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;S2&lt;/tt&gt; considers &lt;tt&gt;P&lt;/tt&gt; but disqualifies it because it only has oplog entries back to &lt;tt&gt;t3&lt;/tt&gt;.&lt;/li&gt;
	&lt;li&gt;&lt;tt&gt;S2&lt;/tt&gt; cannot find a valid sync source and falls off the oplog even though &lt;tt&gt;S1&lt;/tt&gt; has the oplog entries it requires to catch up.&lt;/li&gt;
&lt;/ol&gt;
</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hu1k8v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>