<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 05:02:36 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-43226] Primary takeover during ReplSetTest.stop can hang due to fsyncLock</title>
                <link>https://jira.mongodb.org/browse/SERVER-43226</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;A rare race, observed in catchup_takeover_two_nodes_ahead.js:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;The test reaches the end and calls ReplSetTest.stop&lt;/li&gt;
	&lt;li&gt;ReplSetTest calls getPrimary() and determines that Node 0 is the primary&lt;/li&gt;
	&lt;li&gt;For unexpected reasons, an election and catchup takeover occur (e.g., Node 0 was blocked by LogKeeper and sent no heartbeats for 10+ seconds). Node 0 begins to step down and reports ismaster: false&lt;/li&gt;
	&lt;li&gt;ReplSetTest proceeds to the point where it fsyncLocks Node 0 before checking dbhashes, because it still thinks Node 0 is primary&lt;/li&gt;
	&lt;li&gt;The new primary cannot complete catchup: queries on Node 0&apos;s oplog time out because Node 0 is fsyncLocked&lt;/li&gt;
	&lt;li&gt;Deeper in ReplSetTest&apos;s dbhash code it calls getPrimary() again, but no nodes report ismaster: true so getPrimary() times out&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Proposals:&lt;/p&gt;

&lt;p&gt;1. Currently, ReplSetTest.checkReplicaSet() fsyncLocks the primary before comparing replicas&apos; dbhashes, oplogs, and/or collection counts. There&apos;s a comment saying the fsyncLock&apos;s purpose is to prevent TTL indexes from reaping documents during these checks. We could call setParameter with ttlMonitorEnabled: false instead of fsyncLock, but I tried that and got test failures due to dbhash mismatches. As I had feared, it&apos;s not only TTL indexes that cause dbhash mismatches, so we need the blunt tool of fsyncLock to prevent &lt;b&gt;any&lt;/b&gt; changes.&lt;/p&gt;

&lt;p&gt;2. All ReplSetTest code that runs with the primary fsyncLocked could be updated so it works while all replicas report ismaster: false, by using the cached self._master value instead of calling getPrimary(). This is a big code change, and it&apos;s hard to maintain. Next year we might change the code, accidentally call getPrimary() while the primary is fsyncLocked, and introduce a new rare BF without knowing it right away.&lt;/p&gt;

&lt;p&gt;3. Same as #2, but in getPrimary(), if no replica reports ismaster: true, assert that no replicas are fsyncLocked. This will help catch mistakes in the future and help diagnose BFs like the current one. However, it&apos;s still a big change to brittle ReplSetTest code.&lt;/p&gt;

&lt;p&gt;4. Add a retry loop in ReplSetTest.checkReplicaSet(): fsyncLock the primary, then try to complete a check. If the check fails, unlock, wait for a primary, relock, and retry.&lt;/p&gt;
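
&lt;p&gt;A minimal sketch of the retry loop in Proposal 4, written as standalone JavaScript rather than actual ReplSetTest code. The names checkWithRetry, waitForPrimary, runCheck, and makeNode are hypothetical, and the in-memory nodes only simulate fsyncLock and fsyncUnlock:&lt;/p&gt;

```javascript
// Hypothetical sketch of Proposal 4. None of these names come from ReplSetTest.
// `cluster` is any object with waitForPrimary(); nodes expose fsyncLock()/fsyncUnlock().
function checkWithRetry(cluster, runCheck, maxAttempts = 3) {
    let lastError = null;
    for (let attemptsLeft = maxAttempts; attemptsLeft > 0; attemptsLeft--) {
        // Re-resolve the primary on every attempt: a takeover may have happened.
        const primary = cluster.waitForPrimary();
        primary.fsyncLock();  // freeze writes so the comparison sees stable data
        try {
            return runCheck(primary);
        } catch (err) {
            lastError = err;  // e.g. the locked node stepped down mid-check
        } finally {
            primary.fsyncUnlock();  // always unlock so a new primary can catch up
        }
    }
    const reason = lastError ? lastError.message : "no attempts made";
    throw new Error("check failed after " + maxAttempts + " attempts: " + reason);
}

// Simulated node that only tracks whether it is fsyncLocked.
function makeNode(name) {
    return {
        name: name,
        locked: false,
        fsyncLock() { this.locked = true; },
        fsyncUnlock() { this.locked = false; },
    };
}

const node0 = makeNode("node0");
const node1 = makeNode("node1");

// The first call returns the stale primary; later calls return the node that took over.
const primaries = [node0, node1];
const cluster = {
    waitForPrimary: () => (primaries.length > 1 ? primaries.shift() : primaries[0]),
};

// The check fails while node0 is wrongly assumed to be primary, then succeeds on node1.
const result = checkWithRetry(cluster, (primary) => {
    if (primary === node0) {
        throw new Error("no node reports ismaster: true");
    }
    return primary.name;
});
console.log(result);  // prints "node1"
```

&lt;p&gt;In this simulation the first attempt locks the stale primary, fails, and unlocks it in the finally block, so a new primary could finish catchup before the second attempt relocks.&lt;/p&gt;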

&lt;p&gt;I&apos;ll submit Proposal 4 for review.&lt;/p&gt;</description>
                <environment></environment>
        <key id="920313">SERVER-43226</key>
            <summary>Primary takeover during ReplSetTest.stop can hang due to fsyncLock</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13201">Fixed</resolution>
                                        <assignee username="jesse@mongodb.com">A. Jesse Jiryu Davis</assignee>
                                    <reporter username="jesse@mongodb.com">A. Jesse Jiryu Davis</reporter>
                        <labels>
                    </labels>
                <created>Mon, 9 Sep 2019 13:12:08 +0000</created>
                <updated>Sun, 29 Oct 2023 22:17:23 +0000</updated>
                            <resolved>Fri, 27 Sep 2019 15:43:39 +0000</resolved>
                                                    <fixVersion>4.3.1</fixVersion>
                                    <component>Replication</component>
                                        <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="2436472" author="xgen-internal-githook" created="Fri, 27 Sep 2019 14:22:53 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;username&apos;: &apos;ajdavis&apos;, &apos;email&apos;: &apos;jesse@mongodb.com&apos;, &apos;name&apos;: &apos;A. Jesse Jiryu Davis&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-43226&quot; title=&quot;Primary takeover during ReplSetTest.stop can hang due to fsyncLock&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-43226&quot;&gt;&lt;del&gt;SERVER-43226&lt;/del&gt;&lt;/a&gt; ReplSetTest.stop can hang due to fsyncLock&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/9b470eb73873f5db5c9fcee5df5316d477a1fa12&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/9b470eb73873f5db5c9fcee5df5316d477a1fa12&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2434251" author="xgen-internal-githook" created="Thu, 26 Sep 2019 12:42:56 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;name&apos;: &apos;A. Jesse Jiryu Davis&apos;, &apos;username&apos;: &apos;ajdavis&apos;, &apos;email&apos;: &apos;jesse@mongodb.com&apos;}
&lt;p&gt;Message: Revert &quot;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-43226&quot; title=&quot;Primary takeover during ReplSetTest.stop can hang due to fsyncLock&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-43226&quot;&gt;&lt;del&gt;SERVER-43226&lt;/del&gt;&lt;/a&gt; ReplSetTest.stop can hang due to fsyncLock&quot;&lt;/p&gt;

&lt;p&gt;This reverts commit 7a9cefdabf985360ada4ac7569b0682320075edf.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/fc0db84002fd553bcf0f7d7a153d79bbba75cf34&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/fc0db84002fd553bcf0f7d7a153d79bbba75cf34&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="2434236" author="jesse" created="Thu, 26 Sep 2019 12:27:20 +0000"  >&lt;p&gt;Re-opening: my previous attempt caused build failures. I&apos;ve reverted it and I&apos;ll investigate.&lt;/p&gt;</comment>
                            <comment id="2433719" author="xgen-internal-githook" created="Thu, 26 Sep 2019 00:39:57 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
{&apos;username&apos;: &apos;ajdavis&apos;, &apos;email&apos;: &apos;jesse@mongodb.com&apos;, &apos;name&apos;: &apos;A. Jesse Jiryu Davis&apos;}
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-43226&quot; title=&quot;Primary takeover during ReplSetTest.stop can hang due to fsyncLock&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-43226&quot;&gt;&lt;del&gt;SERVER-43226&lt;/del&gt;&lt;/a&gt; ReplSetTest.stop can hang due to fsyncLock&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/7a9cefdabf985360ada4ac7569b0682320075edf&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/7a9cefdabf985360ada4ac7569b0682320075edf&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                                                <inwardlinks description="is depended on by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10520">
                    <name>Problem/Incident</name>
                                            <outwardlinks description="causes">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10011" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Backwards Compatibility</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10038"><![CDATA[Fully Compatible]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 26 Sep 2019 00:39:57 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        4 years, 19 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_17050" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Downstream Team Attention</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="16941"><![CDATA[Not Needed]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            4 years, 19 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_16465" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Linked BF Score</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>47.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>jesse@mongodb.com</customfieldvalue>
            <customfieldvalue>xgen-internal-githook</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvp3zr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hvdsyn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="3201">Repl 2019-09-09</customfieldvalue>
    <customfieldvalue id="3202">Repl 2019-09-23</customfieldvalue>
    <customfieldvalue id="3260">Repl 2019-10-07</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hvoq93:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>