<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 09:02:20 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[JAVA-4535] Driver should reconnect to replica set restarted as &apos;fresh&apos;</title>
                <link>https://jira.mongodb.org/browse/JAVA-4535</link>
                <project id="10006" key="JAVA">Java Driver</project>
                    <description>&lt;h4&gt;&lt;a name=&quot;Summary&quot;&gt;&lt;/a&gt;Summary&lt;/h4&gt;

&lt;p&gt;The driver is unable to reconnect to the primary after the replica set is restarted as &apos;fresh&apos;. The driver stores &lt;em&gt;maxSetVersion&lt;/em&gt; and &lt;em&gt;maxElectionId&lt;/em&gt; in memory, but when the replica set is shut down and restarted from scratch (with new data directories), elections also start from scratch, so the new values are no longer comparable to the ones stored in the driver.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Driver version:&lt;/b&gt; 4.4.2&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Mongo version:&lt;/b&gt;&#160;5.0.3&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Topology:&lt;/b&gt; replica set, 3 members, 1 primary, 2 secondaries&lt;/p&gt;
&lt;h4&gt;&lt;a name=&quot;HowtoReproduce&quot;&gt;&lt;/a&gt;How to Reproduce&lt;/h4&gt;

&lt;p&gt;1. Set up a Java client connected to a MongoDB replica set.&lt;/p&gt;

&lt;p&gt;2. Trigger a few elections to overwrite&#160;&lt;em&gt;maxElectionId.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;3. Shut down the replica set and wipe all the data.&lt;/p&gt;

&lt;p&gt;4. Start up the replica set again.&lt;/p&gt;

&lt;p&gt;5. The Java app is unable to reconnect to the primary and perform writes (or reads, if readPreference is &lt;em&gt;primary&lt;/em&gt;).&lt;/p&gt;
&lt;h4&gt;&lt;a name=&quot;AdditionalBackground&quot;&gt;&lt;/a&gt;Additional Background&lt;/h4&gt;

&lt;p&gt;In my setup, I have a Java application connected to a MongoDB replica set with 3 members (primary, secondary, secondary). I want to test a disaster recovery scenario, so I shut down the replica set and wipe all the data. Then I start the replica set from scratch and restore the data from backups. After that, the still-running Java app is unable to reconnect to the primary to perform write operations.&lt;/p&gt;

&lt;p&gt;The exceptions that are thrown look like this:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;com.mongodb.MongoTimeoutException: Timed out after &lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;30000&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; ms &lt;/span&gt;&lt;span style=&quot;color: #006699; font-weight: bold; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;while&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; waiting &lt;/span&gt;&lt;span style=&quot;color: #006699; font-weight: bold; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;for&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; a server that matches WritableServerSelector. 
Client view of cluster state is {type=REPLICA_SET, servers=[{address=mongo1.host:&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;27017&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;, type=UNKNOWN, state=CONNECTING}, {address=mongo2.host:&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;27017&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;, type=REPLICA_SET_SECONDARY, roundTripTime=&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;0.9&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; ms, state=CONNECTED}, {address=mongo3.host:&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;27017&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;, type=REPLICA_SET_SECONDARY, roundTripTime=&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;1.5&lt;/span&gt;&lt;span 
style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; ms, state=CONNECTED}]&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;
&lt;p&gt;But the underlying issue is this:&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;org.mongodb.driver.cluster: Invalidating potential primary mongo1.host:&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;27017&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; whose (set version, election id) tuple of (&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;5&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;, 7fffffff0000000000000002) is less than one already seen of (&lt;/span&gt;&lt;span style=&quot;color: #009900; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;13&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;, 7fffffff0000000000000013)&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;
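&lt;p&gt;A minimal sketch of the kind of comparison that the log line above reports, assuming the (set version, election id) tuple is ordered by set version first and election id second, as the message wording suggests. The class and method names are hypothetical and this is not the driver&apos;s actual code; the values are the ones from the log line.&lt;/p&gt;

```java
import java.math.BigInteger;

// Hypothetical illustration only; not taken from the driver source.
class StalePrimaryCheck {

    // Returns true when the reported tuple is lower than the highest tuple
    // already seen. Assumed ordering: setVersion first, then electionId.
    static boolean isStale(int setVersion, BigInteger electionId,
                           int maxSetVersion, BigInteger maxElectionId) {
        if (setVersion != maxSetVersion) {
            return maxSetVersion > setVersion;
        }
        return maxElectionId.compareTo(electionId) > 0;
    }

    // Parses a 24-character hex ObjectId string as an unsigned number.
    static BigInteger parseElectionId(String hex) {
        return new BigInteger(hex, 16);
    }

    public static void main(String[] args) {
        BigInteger seen = parseElectionId("7fffffff0000000000000013");
        BigInteger fresh = parseElectionId("7fffffff0000000000000002");

        // Restarted replica set reports (5, ...02); the client cached (13, ...13).
        System.out.println(isStale(5, fresh, 13, seen));   // true
        // A primary reporting the cached tuple itself is not stale.
        System.out.println(isStale(13, seen, 13, seen));   // false
    }
}
```

&lt;p&gt;Under these assumptions, a freshly initialized replica set looks stale to a long-running client until its tuple exceeds the cached maximum.&lt;/p&gt;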
&lt;p&gt;So it seems that the driver is unable to connect to the &apos;new&apos; primary because it has already seen a primary with a higher electionId, even though the whole replica set was restarted in the meantime and elections started from scratch.&lt;/p&gt;</description>
                <environment></environment>
        <key id="2002816">JAVA-4535</key>
            <summary>Driver should reconnect to replica set restarted as &apos;fresh&apos;</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="10300" iconUrl="https://jira.mongodb.org/images/icons/priorities/medium.svg">Unknown</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13203">Gone away</resolution>
                                        <assignee username="jeff.yemin@mongodb.com">Jeffrey Yemin</assignee>
                                    <reporter username="tymoteuszm@qualtrics.com">Tymoteusz Machowski</reporter>
                        <labels>
                            <label>external-user</label>
                    </labels>
                <created>Wed, 16 Mar 2022 10:59:05 +0000</created>
                <updated>Fri, 27 Oct 2023 19:48:32 +0000</updated>
                            <resolved>Wed, 13 Apr 2022 12:00:34 +0000</resolved>
                                    <version>4.4.2</version>
                                                    <component>Cluster Management</component>
                                        <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="4479728" author="dbeng-pm-bot" created="Wed, 13 Apr 2022 12:00:37 +0000"  >&lt;p&gt;There hasn&apos;t been any recent activity on this ticket, so we&apos;re resolving it. Thanks for reaching out! Please feel free to comment on this if you&apos;re able to provide more information.&lt;/p&gt;</comment>
                            <comment id="4443610" author="jeff.yemin" created="Tue, 29 Mar 2022 23:47:48 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=tymoteuszm%40qualtrics.com&quot; class=&quot;user-hover&quot; rel=&quot;tymoteuszm@qualtrics.com&quot;&gt;tymoteuszm@qualtrics.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We&apos;ve discussed internally and determined that this can&apos;t really be addressed solely on the client side.  The setVersion and electionId checks serve an important function in preventing drivers from using stale primaries in split-brain network scenarios, so we don&apos;t want to remove those checks.&lt;/p&gt;

&lt;p&gt;Anything else we could do to address this would appear to require additional functionality in the MongoDB server, and that is unlikely to be prioritized unless we get signal from more users that it&apos;s an important use case for them.&lt;/p&gt;

&lt;p&gt;As a workaround, you can do what Atlas itself does when it restores from a backup: restore the local.system.replset collection so that the electionId/setVersion values are not seen as stale by clients.&lt;/p&gt;</comment>
                            <comment id="4442095" author="JIRAUSER1265501" created="Tue, 29 Mar 2022 15:33:55 +0000"  >&lt;p&gt;This is not (or at least not only) related to Atlas. I was describing a case with a self-hosted MongoDB deployment, where DNS or a service registry directs connections to a different (newly created) cluster under the same name. But I can imagine it could also happen with Atlas when restoring a backup in place, as you mentioned.&#160;&lt;/p&gt;</comment>
                            <comment id="4435626" author="andrew.davidson@10gen.com" created="Fri, 25 Mar 2022 16:44:01 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=tymoteuszm%40qualtrics.com&quot; class=&quot;user-hover&quot; rel=&quot;tymoteuszm@qualtrics.com&quot;&gt;tymoteuszm@qualtrics.com&lt;/a&gt;&#160;does this happen in Atlas?&lt;/p&gt;

&lt;p&gt;I would imagine the closest parallel in Atlas would be when you restore a backup in place to an existing cluster (because when you restore to a distinct cluster you have to change your connection string anyway). Does that sound right? cc &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=benjamin.cefalo&quot; class=&quot;user-hover&quot; rel=&quot;benjamin.cefalo&quot;&gt;benjamin.cefalo&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="4435526" author="JIRAUSER1265501" created="Fri, 25 Mar 2022 16:13:32 +0000"  >&lt;p&gt;Thanks for the comments.&lt;/p&gt;

&lt;p&gt;The entire replacement of a replica set could happen e.g. in a Disaster Recovery scenario, where machines hosting mongo or disks fail completely, and the replica set has to be set up from scratch on different machines, with data recovered from backups (but not including replica set state or configuration, which is stored e.g. statically in a separate configuration system).&lt;/p&gt;

&lt;p&gt;It would be nice if client applications using the driver didn&apos;t have to be restarted in such a case and instead reconnected automatically to the &quot;new&quot; replica set.&lt;/p&gt;</comment>
                            <comment id="4433685" author="jeff.yemin" created="Thu, 24 Mar 2022 20:55:36 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=tymoteuszm%40qualtrics.com&quot; class=&quot;user-hover&quot; rel=&quot;tymoteuszm@qualtrics.com&quot;&gt;tymoteuszm@qualtrics.com&lt;/a&gt; thanks for bringing this to our attention. As you correctly deduced, this is the expected behavior given the algorithm that the driver implements to prevent sending writes to stale primaries. &lt;/p&gt;

&lt;p&gt;Currently, we are not aware of any production use cases/scenarios where it&apos;s necessary to completely replace a replica set as you described.  I can see how this could happen in a dev environment, though I&apos;ve never encountered it in my own development.&lt;/p&gt;

&lt;p&gt;If you could describe your use case (for replacing a replica set entirely) in more detail, that would help us to decide whether it&apos;s something that we think should be addressed.&lt;/p&gt;</comment>
                            <comment id="4433060" author="lamont.nelson" created="Thu, 24 Mar 2022 17:49:15 +0000"  >&lt;p&gt;If I understand the scenario, we are resetting the (electionId, setVersion) tuple to an initial value. For any RSM instances that are monitoring this replica set, they would see a response from the primary as stale and &lt;a href=&quot;https://github.com/mongodb/mongo/blob/25122c60597d66c1901502c1fd1c203392d4963e/src/mongo/client/sdam/topology_state_machine.cpp#L288&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;ignore it&lt;/a&gt; causing unavailability until the node is restarted or until the tuple increases past the cached version.&lt;/p&gt;

&lt;p&gt;I think the question of why the election id doesn&apos;t fill in the timestamp portion is probably best for the replication team, but I&apos;ll take a stab at interpreting the code. I think the value originates from &lt;a href=&quot;https://github.com/mongodb/mongo/blob/25122c60597d66c1901502c1fd1c203392d4963e/src/mongo/db/repl/replication_coordinator_impl.cpp#L4527&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;. So it is derived from the raft term and a timestamp. However &lt;a href=&quot;https://github.com/mongodb/mongo/blob/25122c60597d66c1901502c1fd1c203392d4963e/src/mongo/bson/oid.cpp#L147&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt; we have something interesting. The comments state &quot;Set max timestamp because the drivers compare ElectionId&apos;s to determine valid new primaries, and we want ElectionId&apos;s with terms to supercede ones without terms.&quot; This is new information for me personally. So we know why the high bits are the max value, but as far as why we coded this way I am not sure.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1908855">JAVA-4375</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i06hss:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            </customfields>
    </item>
</channel>
</rss>