[KAFKA-315] Resume copy from where it left off after a restart Created: 11/May/22  Updated: 12/Dec/22

Status: Backlog
Project: Kafka Connector
Component/s: Source
Affects Version/s: 1.6.1
Fix Version/s: None

Type: Improvement Priority: Unknown
Reporter: Colin Smetz Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on KAFKA-61 Improve Source Connector Performance ... Backlog
Epic Link: Kafka Source Scalability

 Description   

When setting the copy.existing option to true, the source connector first performs a copy of the whole MongoDB collection. However, if the connector restarts before the copy is finished, then the copy restarts from scratch.
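For reference, this is the behavior driven by a source configuration along the following lines (a minimal sketch; the name, URI, database, and collection values are placeholders for illustration):

{code}
# MongoDB Kafka source connector (1.6.x); connection values are placeholders.
name=mongo-source
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
connection.uri=mongodb://mongodb0.example.com:27017
database=mydb
collection=mycollection
# Copy the existing collection contents before streaming change events.
copy.existing=true
{code}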

This is an issue for large collections that take a lot of time to ingest because:

  • The probability of a restart happening during the copy is higher
  • The impact is greater: many duplicates are written to Kafka, and the process takes longer to finish than expected.

It would be great if there were some failure-recovery mechanism to make sure that the copy resumes from where it left off before the restart of the connector.
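One possible direction, sketched under assumptions: Kafka Connect already persists per-record source offsets, so the copy could checkpoint the last copied _id through the same offset storage and skip past it on restart. The COPY_PARTITION key, the lastId field, and the ResumableCopy helper below are hypothetical names, not part of the connector:

{code:java}
import java.util.Collections;
import java.util.Map;

import org.apache.kafka.connect.source.SourceTaskContext;
import org.bson.BsonDocument;
import org.bson.conversions.Bson;

import com.mongodb.client.FindIterable;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Sorts;

/** Hypothetical helper: resume the initial copy from the last committed _id. */
final class ResumableCopy {

    // Hypothetical partition key for copy progress, kept separate from the
    // partition that stores the change-stream resume token.
    static final Map<String, String> COPY_PARTITION =
            Collections.singletonMap("phase", "copy");

    /**
     * Build the copy query. If a copy offset was committed before the
     * restart, scan only documents after the checkpointed _id; otherwise
     * start from the beginning. Sorting by _id makes the scan order
     * deterministic, which is what makes the checkpoint safe to resume from.
     */
    static FindIterable<BsonDocument> resumeQuery(
            SourceTaskContext context, MongoCollection<BsonDocument> collection) {
        Map<String, Object> offset =
                context.offsetStorageReader().offset(COPY_PARTITION);
        Bson filter = new BsonDocument();
        if (offset != null && offset.get("lastId") != null) {
            // The _id is stored in the offset map as extended JSON, since
            // Connect offsets only hold primitive values.
            BsonDocument checkpoint =
                    BsonDocument.parse((String) offset.get("lastId"));
            filter = Filters.gt("_id", checkpoint.get("_id"));
        }
        return collection.find(filter).sort(Sorts.ascending("_id"));
    }
}
{code}

Each copied document's SourceRecord would then carry COPY_PARTITION as its source partition and the extended-JSON _id as its source offset, so the framework commits copy progress alongside the records themselves.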



 Comments   
Comment by Robert Walters [ 10/Oct/22 ]

Added ticket as part of the Kafka Source Performance/Scale improvement

Comment by PM Bot [ 01/Jun/22 ]

There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information.

Comment by Colin Smetz [ 18/May/22 ]

I gave all the details in this forum discussion.

Basically:

  • We had an issue a while ago causing Kafka Connect to restart 2-3 times per hour. Of course, we fixed that issue first. But we can't guarantee that similar unexpected issues won't happen again, and ideally we would like our connectors to be resilient to that.
  • We also have a daily restart that we are required to do.
  • We do not need to replicate the data to another MongoDB cluster. We need to ingest the data into Kafka so it can be used by other tools on our side that work specifically with Kafka. So we can't bypass Kafka.

So while it would make our lives easier, it is not critical either. We haven't yet had a case where it was impossible to finish the initial copy before the next restart, but ideally we would like to be sure that we have a solution if that case ever arises.

Comment by Robert Walters [ 16/May/22 ]

colin.smetz@euranova.eu How often is your connector failing, and can you try to address that issue? This is a complex issue to solve and one that hasn't come up before. What is your use case? Are you looking to just replicate MongoDB data from source to sink?
