<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 07:40:10 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[DOCS-1064] Not clear what state a very stale replica node will exhibit </title>
                <link>https://jira.mongodb.org/browse/DOCS-1064</link>
                <project id="10380" key="DOCS">Documentation</project>
                    <description>&lt;p&gt;The manual says that if a mongo node gets too far behind and the oplog wraps then it won&apos;t be able to catch up and it will be stuck in that state without manual intervention.  &lt;/p&gt;

&lt;p&gt;I&apos;m trying to write a script to detect that state and alarm on my cluster but it&apos;s not clear how to detect it.  The link above gives a list of the possible cluster member states (as reported by rs.status().members&amp;#91;n&amp;#93;.state) but it&apos;s not clear how you can tell apart &quot;stale but able to catch up&quot; and &quot;very stale, cannot catch up&quot;.  Are those states distinguished? If so, what do they map to, or do I need to look elsewhere?&lt;/p&gt;</description>
                <environment>&lt;a href=&quot;http://docs.mongodb.org/manual/reference/replica-status/&quot;&gt;http://docs.mongodb.org/manual/reference/replica-status/&lt;/a&gt;</environment>
        <key id="63589">DOCS-1064</key>
            <summary>Not clear what state a very stale replica node will exhibit </summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="bgrabar">Bob Grabar</assignee>
                                    <reporter username="fasaxc">Shaun Crampton</reporter>
                        <labels>
                    </labels>
                <created>Tue, 29 Jan 2013 01:12:35 +0000</created>
                <updated>Mon, 30 Oct 2023 21:57:06 +0000</updated>
                            <resolved>Mon, 15 Apr 2013 22:26:16 +0000</resolved>
                                    <version>mongodb-2.2</version>
                                    <fixVersion>Server_Docs_20231030</fixVersion>
                                    <component>manual</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                                                <comments>
                            <comment id="252996" author="fasaxc" created="Tue, 29 Jan 2013 17:56:56 +0000"  >&lt;p&gt;Thanks, that&apos;s very clear now.  It&apos;s easy to understand how hard it is to handle the case where the replication simply isn&apos;t fast enough.  Thankfully, I don&apos;t think that will be an issue in my application.&lt;/p&gt;</comment>
                            <comment id="252990" author="samk" created="Tue, 29 Jan 2013 17:48:25 +0000"  >&lt;p&gt;If the instance can&apos;t catch up and must enter initial sync, then this operation happens automatically. &lt;/p&gt;

&lt;p&gt;You may be able to monitor replication lag using the value of optimeDate (&lt;a href=&quot;http://docs.mongodb.org/manual/reference/replica-status/#replSetGetStatus.members.optimeDate&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://docs.mongodb.org/manual/reference/replica-status/#replSetGetStatus.members.optimeDate&lt;/a&gt;) from all members of the set. To determine the &quot;length&quot; of the oplog in time you&apos;ll need to know the average size of each oplog entry and the frequency of operations, as well as the size of the oplog on all machines that may become primary. &lt;/p&gt;
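
&lt;p&gt;A minimal sketch of the lag comparison described above, assuming each entry of the members array has already been fetched as a dictionary with a name and an optimeDate (the function name and the sample data are hypothetical, and members that report no optimeDate would need to be filtered out first):&lt;/p&gt;

```python
from datetime import datetime, timezone


def replication_lag_seconds(members):
    """Return {member name: seconds behind the most recent optimeDate}."""
    # The member with the newest optimeDate is used as the reference point.
    newest = max(m["optimeDate"] for m in members)
    return {m["name"]: (newest - m["optimeDate"]).total_seconds() for m in members}


# Hypothetical sample shaped like the members array of replSetGetStatus output.
sample_members = [
    {"name": "db1:27017", "optimeDate": datetime(2013, 1, 29, 17, 0, 0, tzinfo=timezone.utc)},
    {"name": "db2:27017", "optimeDate": datetime(2013, 1, 29, 16, 59, 30, tzinfo=timezone.utc)},
]
lags = replication_lag_seconds(sample_members)
```

&lt;p&gt;The resulting per-member lag can be compared against an alert threshold chosen for the deployment. Lag alone does not show whether a member can still catch up, since that also depends on the oplog window discussed in this thread.&lt;/p&gt;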

&lt;p&gt;In many situations it&apos;s difficult to predict what will happen with regard to replication lag. Under some loads, a replica could fall behind the state of the primary by an hour (say) during a large bulk operation and then reliably catch up. For other deployments, falling behind by more than 15 minutes might be unrecoverable via normal replication (though the conditions required to reproduce this are probably pathological).  &lt;/p&gt;

&lt;p&gt;During the rollback operation (which may take several moments) the members.state value (&lt;a href=&quot;http://docs.mongodb.org/manual/reference/replica-status/#replSetGetStatus.members.state&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://docs.mongodb.org/manual/reference/replica-status/#replSetGetStatus.members.state&lt;/a&gt;) will be &quot;9&quot; for rollback.&lt;/p&gt;

&lt;p&gt;While we will improve the documentation on this point, you should also be aware that MMS, which is a free service, monitors replication lag and can issue generic alerts for the kinds of events that you need to know about in this case.&lt;/p&gt;</comment>
                            <comment id="252958" author="fasaxc" created="Tue, 29 Jan 2013 17:11:22 +0000"  >&lt;p&gt;Thanks for the response, that clarifies the behavior somewhat.  The thing I&apos;m still puzzling over is the best way to recognize the bad states that you mention.&lt;/p&gt;

&lt;p&gt;If the instance can&apos;t catch up because the most recent oplog entry isn&apos;t in the primary&apos;s oplog then does the instance indicate an error?  Is there some other way to tell that an instance has got stuck in that state (e.g. can I somehow look at the oplog times for the different instances and compare them to work out that the instance will never catch up?)&lt;/p&gt;

&lt;p&gt;I think the rollback case is easier to spot because I can check for a rollback log.&lt;/p&gt;</comment>
                            <comment id="252379" author="samk" created="Tue, 29 Jan 2013 01:47:28 +0000"  >&lt;p&gt;We&apos;ll make a note to clarify the language here. &lt;/p&gt;

&lt;p&gt;The problem is less cut and dried: the oplog has a fixed size that depends on the amount of free disk space when the member was added to the replica set (or when the set was initiated, for the first member). Replication is asynchronous, and a member can stop fetching operations from the oplog and then resume replicating normally as long as the most recent entry in that member&apos;s oplog still exists in the primary&apos;s oplog. If the member&apos;s most recent oplog entry is not in the primary&apos;s oplog, then there&apos;s no way for the normal replication process to ensure that all members of the set have an identical data set. So we say that the member can&apos;t catch up and must run the &quot;initial sync&quot; routine, which allows it to copy the data state from a known &quot;up to date&quot; member of the set and then apply the collected oplog entries. The result is a member with a data set equivalent to that of all other members. &lt;/p&gt;

&lt;p&gt;As the name implies &quot;initial sync&quot; is the process replica set members use to synchronize when they are first added to the set. &lt;/p&gt;

&lt;p&gt;If initial sync cannot complete within the &quot;oplog window,&quot; which is to say if the first oplog entry copied during the sync is no longer present in the primary&apos;s oplog when the initial sync completes and the member begins applying oplog entries, then the replica set member will be stuck in a sort of endless loop, unable to obtain a data set equivalent to the other members of the set. To remedy this you will need to reduce the rate of operations on the primary during the initial sync, or you will need to find a way of syncing the member more quickly (e.g. copying data files from a snapshot of a working secondary). &lt;/p&gt;

&lt;p&gt;The only part of this process that may systematically require manual intervention is when a member of a replica set enters the &quot;rollback&quot; state, or when a member that was previously primary is disconnected from the set. If the primary accepts write operations before it realizes that its secondaries are no longer connected &lt;b&gt;and&lt;/b&gt; the secondaries elect a new primary, then there are a number of operations on the old primary that aren&apos;t on the new primary and therefore aren&apos;t on the rest of the set. In this situation the old primary will &quot;undo&quot; or &quot;roll back&quot; the operations that it has that are not present in the &quot;new&quot; set. The old primary writes these documents out to .bson files in the dbpath, and you can inspect and save (or not) the data as needed.&lt;/p&gt;

&lt;p&gt;Does this help?&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="60426">DOCS-927</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Tue, 29 Jan 2013 01:47:28 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        11 years, 3 weeks, 1 day ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>emet.ozar@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            11 years, 3 weeks, 1 day ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>bgrabar</customfieldvalue>
            <customfieldvalue>sam.kleinman</customfieldvalue>
            <customfieldvalue>fasaxc</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrs2jz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrlvb3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>41136</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hryfen:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                </customfields>
    </item>
</channel>
</rss>