<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:30:27 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-32513] Initial sync unnecessarily throws away oplog entries</title>
                <link>https://jira.mongodb.org/browse/SERVER-32513</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Initial sync proceeds through the following phases:&lt;br/&gt;
1) Get the &quot;initial sync begin timestamp&quot; (B). In 4.0 and earlier this is the most recent oplog entry on the sync source. In 4.2 we will get both the &quot;initial sync fetch begin timestamp&quot; (Bf) which will equal the oldest active transaction oplog entry on the sync source and the &quot;initial sync apply begin timestamp&quot;  (Ba) which will be the most recent oplog entry on the sync source.&lt;br/&gt;
2) Start fetching oplog entries from B (or in 4.2 the Bf). Whenever an oplog entry is fetched, it is inserted into an uncapped local collection.&lt;br/&gt;
3) Clone all data, simultaneously creating indexes as we clone each collection&lt;br/&gt;
4) Get the &quot;initial sync end timestamp&quot; (E), which will be the most recent oplog entry on the sync source &lt;br/&gt;
5) Start applying oplog entries from B in 4.0 or earlier or Ba in 4.2+. When applying an oplog entry, it also gets written into the real, capped oplog.&lt;br/&gt;
6) As we apply oplog entries, if we try to apply an update but do not have a local version of the document to update we fetch that document from the sync source, and get a new &quot;initial sync end timestamp&quot; by fetching the most recent oplog entry on the sync source again.&lt;br/&gt;
7) Stop both fetching and applying once we have applied up to the most recently set value for E (&quot;initial sync end timestamp&quot;). We have been fetching this entire time and have generally fetched much more oplog than is necessary; say the last oplog entry fetched was at time F, such that F&amp;gt;E.&lt;br/&gt;
8) Drop the uncapped local collection&lt;br/&gt;
9) Leave initial sync and begin fetching from E&lt;/p&gt;
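
&lt;p&gt;As a rough illustration only, the phases above can be modeled with integer timestamps. This is a toy sketch, not the server implementation: the function name and its parameters are hypothetical, it assumes B, E and F are in increasing order, and it ignores the missing-document case in phase 6 that can move E forward.&lt;/p&gt;

```python
def initial_sync_sketch(begin_ts, end_ts, last_fetched_ts):
    """Toy model of the phases above, using integers as oplog timestamps.

    begin_ts is B, end_ts is E and last_fetched_ts is F. All names are
    hypothetical and B, E, F are assumed to be in increasing order.
    """
    # Phase 2: everything fetched from B through F lands in the uncapped buffer.
    buffer = list(range(begin_ts, last_fetched_ts + 1))
    # Phases 5 to 7: entries from B through E are applied and written to the oplog.
    split = end_ts - begin_ts + 1
    applied = buffer[:split]
    # Phase 8: dropping the buffer discards everything fetched after E.
    discarded = buffer[split:]
    # Phase 9: steady state resumes from E, so the discarded entries are fetched again.
    return {"applied": applied, "discarded": discarded, "resume_from": end_ts}


result = initial_sync_sketch(begin_ts=10, end_ts=14, last_fetched_ts=18)
print(result["applied"])      # [10, 11, 12, 13, 14]
print(result["discarded"])    # [15, 16, 17, 18]
print(result["resume_from"])  # 14
```

&lt;p&gt;The discarded list is the part of the buffer this ticket is concerned with: it was already fetched once, and it has to be fetched again after leaving initial sync.&lt;/p&gt;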

&lt;p&gt;At the end of initial sync, the extra oplog entries in our oplog buffer (from E to F above) are simply thrown away instead of being transferred to the steady state oplog buffer. By beginning to fetch immediately and buffering fetched oplog entries in a collection capped only by the size of the disk on the initial syncing node, the initial sync itself should almost never fail due to falling off the back of the sync source oplog. That would only happen if the sync source were writing to the oplog faster than the initial syncing node could fetch oplog entries and write them to a local collection, without even applying them.&lt;/p&gt;

&lt;p&gt;However, suppose we fetch E at wall-clock time A1 and complete initial sync at wall-clock time A2 (so we fetch F at time A2). We then throw away all oplog entries from E to F, which we fetched between wall-clock times A1 and A2, and have to refetch them. Thus at wall-clock time A2 we must be able to fetch oplog entry E when the sync source has already written all the way to F. This means that if, between wall-clock times A1 and A2, the sync source rolled over its oplog and threw away E for being too old, the initial syncing node will be unable to fetch from its sync source immediately after leaving initial sync. As a result, the minimum amount of oplog required in this case is E to F, or, if the oplog is growing at a steady rate, the amount of oplog written between wall-clock times A1 and A2. Since this rate is hard to calculate and sizing to it exactly would be cutting it close, a sync source oplog significantly larger than the span from E to F is advisable.&lt;/p&gt;
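
&lt;p&gt;A minimal sketch of the arithmetic in the paragraph above, assuming a steady oplog write rate. The function names and the numbers are hypothetical and only illustrate the calculation.&lt;/p&gt;

```python
def oplog_written_mb(rate_mb_per_sec, a1_sec, a2_sec):
    """MB of oplog the sync source writes between wall-clock times A1 and A2."""
    return rate_mb_per_sec * (a2_sec - a1_sec)


def headroom_mb(source_oplog_mb, rate_mb_per_sec, a1_sec, a2_sec):
    """Spare sync source oplog left once the span from E to F is accounted for.

    A negative result means E has already rolled off the sync source oplog
    by A2, so the node cannot resume fetching after leaving initial sync.
    """
    return source_oplog_mb - oplog_written_mb(rate_mb_per_sec, a1_sec, a2_sec)


# Hypothetical numbers: 2 MB/s of oplog writes, one hour between A1 and A2.
print(oplog_written_mb(2, 0, 3600))    # 7200
print(headroom_mb(10000, 2, 0, 3600))  # 2800
print(headroom_mb(5000, 2, 0, 3600))   # -2200
```

&lt;p&gt;With these made-up numbers, a 10000 MB sync source oplog leaves some headroom, while a 5000 MB oplog would already have lost E by the time initial sync completes.&lt;/p&gt;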

&lt;p&gt;Now that the storage engine allows us to truncate the oldest oplog entries asynchronously when we&apos;re ready (unlike mmap, where the oplog truly had a fixed size), we can write all oplog entries into the real, capped oplog during initial sync by instructing the storage engine to ignore the cap during initial sync, and then slowly shrink the oplog back to its desired size as we apply oplog entries and catch up to the primary.&lt;/p&gt;</description>
                <environment></environment>
        <key id="477784">SERVER-32513</key>
            <summary>Initial sync unnecessarily throws away oplog entries</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="2">Won&apos;t Fix</resolution>
                                        <assignee username="backlog-server-repl">Backlog - Replication Team</assignee>
                                    <reporter username="judah.schvimer@mongodb.com">Judah Schvimer</reporter>
                        <labels>
                    </labels>
                <created>Tue, 2 Jan 2018 16:22:53 +0000</created>
                <updated>Tue, 22 Aug 2023 14:20:45 +0000</updated>
                            <resolved>Tue, 22 Aug 2023 14:20:45 +0000</resolved>
                                                                    <component>Replication</component>
                                        <votes>10</votes>
                                    <watches>39</watches>
                                                                                                                <comments>
                            <comment id="1823935" author="spencer" created="Tue, 6 Mar 2018 00:02:03 +0000"  >&lt;p&gt;I think the way to go about doing this would be to make the oplog buffer used during initial sync just be the oplog instead of a separate collection.  It was made a separate collection originally so that it could be uncapped and grow to unbounded size during initial sync.  We now, however, have the ability to resize the oplog dynamically, so we could just let the oplog grow unbounded during initial sync, then after initial sync set it to its configured size.  Then after leaving initial sync, the normal steady state oplog application logic would kick in, seeing that the lastOpApplied is behind the top of the oplog, so we&apos;d finish applying all the oplog we already have before fetching any new oplog.&lt;/p&gt;

&lt;p&gt;While this would definitely be an improvement, it still wouldn&apos;t make us completely independent of the size of the sync source&apos;s oplog during initial sync. This is because after finishing initial sync, we may have a large buffer of ops we fetched during initial sync that we still need to apply, and we won&apos;t fetch any new oplog entries from our sync source until we&apos;ve caught up applying the ops we already have.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10520">
                    <name>Problem/Incident</name>
                                            <outwardlinks description="causes">
                                                        </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="510849">SERVER-33866</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25128"><![CDATA[Replication]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[500A000000aAE2NIAW, 500A000000b9l68IAA, 5002K00000ve3RkQAI]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 6 Mar 2018 00:02:03 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        5 years, 49 weeks, 2 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>PM-1083</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>opal.hoyt@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            5 years, 49 weeks, 2 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>backlog-server-repl</customfieldvalue>
            <customfieldvalue>judah.schvimer@mongodb.com</customfieldvalue>
            <customfieldvalue>spencer@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|htn5tr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hteq1b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|htmry7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>