<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 09:06:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[KAFKA-315] Resume copy from where it left off after a restart</title>
                <link>https://jira.mongodb.org/browse/KAFKA-315</link>
                <project id="16285" key="KAFKA">Kafka Connector</project>
                    <description>&lt;p&gt;When the copy.existing option is set to true, the source connector first copies the whole MongoDB collection. However, if the connector restarts before the copy finishes, the copy starts over from scratch.&lt;/p&gt;

&lt;p&gt;This is an issue for large collections that take a lot of time to ingest because:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;A restart is more likely to happen during the copy.&lt;/li&gt;
	&lt;li&gt;The impact is greater: many duplicates are written to Kafka and the process takes longer to finish than expected.&lt;/li&gt;
&lt;/ul&gt;
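
&lt;p&gt;To make the problem concrete, here is a minimal, hypothetical Python sketch of a copy that records its progress. It is not the connector&apos;s actual code: it assumes documents can be read in _id order and that progress can be saved in a small checkpoint. The names copy_with_checkpoint, sink, checkpoint and fail_after are illustrative only, and a plain list stands in for Kafka.&lt;/p&gt;

```python
# Hypothetical sketch: none of these names are MongoDB Kafka Connector APIs.

def copy_with_checkpoint(collection, sink, checkpoint, fail_after=None):
    """Copy documents in _id order, skipping everything up to checkpoint["last_id"]."""
    last_id = checkpoint.get("last_id")
    pending = sorted(
        (doc for doc in collection if last_id is None or doc["_id"] > last_id),
        key=lambda doc: doc["_id"],
    )
    for copied, doc in enumerate(pending):
        if fail_after is not None and copied == fail_after:
            raise RuntimeError("simulated connector restart")
        sink.append(doc)                    # stands in for producing to Kafka
        checkpoint["last_id"] = doc["_id"]  # record progress after each write


collection = [{"_id": i} for i in range(1, 6)]
sink, checkpoint = [], {}

try:
    # The first attempt is interrupted after three documents.
    copy_with_checkpoint(collection, sink, checkpoint, fail_after=3)
except RuntimeError:
    pass  # the "connector" restarts here

# The second attempt resumes after the last recorded _id instead of starting over.
copy_with_checkpoint(collection, sink, checkpoint)
copied_ids = [doc["_id"] for doc in sink]
```

&lt;p&gt;With an empty checkpoint on the second attempt, the first three documents would be copied again, which is the duplication described above. In a real setup the write and the checkpoint update would not be atomic, so a few duplicates around the restart point would probably still be possible.&lt;/p&gt;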


&lt;p&gt;It would be great to have a failure recovery mechanism that makes the copy resume from where it left off when the connector restarts.&lt;/p&gt;</description>
                <environment></environment>
        <key id="2044309">KAFKA-315</key>
            <summary>Resume copy from where it left off after a restart</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="10300" iconUrl="https://jira.mongodb.org/images/icons/priorities/medium.svg">Unknown</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="colin.smetz@euranova.eu">Colin Smetz</reporter>
                        <labels>
                    </labels>
                <created>Wed, 11 May 2022 07:54:25 +0000</created>
                <updated>Mon, 12 Dec 2022 23:44:07 +0000</updated>
                                            <version>1.6.1</version>
                                                    <component>Source</component>
                                        <votes>1</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="4891399" author="robert.walters" created="Mon, 10 Oct 2022 16:56:41 +0000"  >&lt;p&gt;Added ticket as part of the Kafka Source Performance/Scale improvement&lt;/p&gt;</comment>
                            <comment id="4586194" author="dbeng-pm-bot" created="Wed, 1 Jun 2022 12:00:37 +0000"  >&lt;p&gt;There hasn&apos;t been any recent activity on this ticket, so we&apos;re resolving it. Thanks for reaching out! Please feel free to comment on this if you&apos;re able to provide more information.&lt;/p&gt;</comment>
                            <comment id="4555502" author="JIRAUSER1269882" created="Wed, 18 May 2022 06:32:22 +0000"  >&lt;p&gt;I gave all the details in &lt;a href=&quot;https://www.mongodb.com/community/forums/t/kafka-source-connector-copy-existing-restarts-from-scratch-if-interrupted/162100/3&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;this forum discussion&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Basically:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;We had an issue a while ago that caused Kafka Connect to restart 2-3 times per hour. Of course, we fixed that issue first. But we can&apos;t guarantee that similar unexpected issues won&apos;t happen again, and &lt;em&gt;ideally&lt;/em&gt; we would like our connectors to be resilient to that.&lt;/li&gt;
	&lt;li&gt;We also have a daily restart that we are required to do.&lt;/li&gt;
	&lt;li&gt;We do not need to replicate the data to another MongoDB cluster. We need to ingest the data into Kafka so that other tools on our side, which work specifically with Kafka, can use it. So we can&#8217;t bypass Kafka.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So while it would make our lives easier, it is not critical. We haven&apos;t had a case where it was &lt;em&gt;impossible&lt;/em&gt; to perform the initial copy before the next restart, but we would like to be sure that we have a solution if that case ever happens.&lt;/p&gt;</comment>
                            <comment id="4550362" author="robert.walters" created="Mon, 16 May 2022 16:31:47 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=colin.smetz%40euranova.eu&quot; class=&quot;user-hover&quot; rel=&quot;colin.smetz@euranova.eu&quot;&gt;colin.smetz@euranova.eu&lt;/a&gt; How often is your connector failing, and can you try to address that issue? This is a complex issue to solve and one that hasn&apos;t come up before. What is your use case? Are you just looking to replicate MongoDB data from source to sink?&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Depends</name>
                                            <outwardlinks description="depends on">
                                        <issuelink>
            <issuekey id="902653">KAFKA-61</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                    <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10857" key="com.pyxis.greenhopper.jira:gh-epic-link">
                        <customfieldname>Epic Link</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>KAFKA-61</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hxky9o:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>