[KAFKA-315] Resume copy from where it left off after a restart Created: 11/May/22 Updated: 12/Dec/22 |
|
| Status: | Backlog |
| Project: | Kafka Connector |
| Component/s: | Source |
| Affects Version/s: | 1.6.1 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Unknown |
| Reporter: | Colin Smetz | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Epic Link: | Kafka Source Scalability | ||||||||
| Description |
|
When the copy.existing option is set to true, the source connector first copies the whole MongoDB collection. However, if the connector restarts before the copy is finished, the copy restarts from scratch. This is an issue for large collections that take a long time to ingest because:
It would be great if there were a failure recovery mechanism that ensures the copy resumes from where it left off before the connector restarted. |
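To make the request concrete, here is a minimal sketch of one way a resumable copy could work: read documents in _id order and record the last _id that was successfully emitted, so that a restarted copy skips everything at or below that checkpoint. This is only an illustration of the idea and not how the connector is implemented. The names copy_with_checkpoint, flaky_sink and SimulatedRestart are hypothetical, the collection is an in-memory list standing in for MongoDB, and the checkpoint is a plain dict standing in for whatever durable offset storage a real implementation would use.

```python
from typing import Callable, Dict, List, Optional

Document = Dict[str, int]


class SimulatedRestart(Exception):
    """Stands in for the connector stopping in the middle of the copy."""


def copy_with_checkpoint(
    collection: List[Document],
    checkpoint: Dict[str, Optional[int]],
    emit: Callable[[Document], None],
) -> None:
    """Copy documents in _id order, skipping those at or below the checkpoint."""
    last_id = checkpoint.get("last_id")
    pending = sorted(
        (doc for doc in collection if last_id is None or doc["_id"] > last_id),
        key=lambda doc: doc["_id"],
    )
    for doc in pending:
        emit(doc)  # may raise if the simulated connector stops
        # Record progress only after the document was emitted successfully.
        checkpoint["last_id"] = doc["_id"]


def flaky_sink(target: List[Document], fail_after: int) -> Callable[[Document], None]:
    """Return an emit function that fails once fail_after documents were written."""
    emitted = 0

    def emit(doc: Document) -> None:
        nonlocal emitted
        if emitted >= fail_after:
            raise SimulatedRestart()
        target.append(doc)
        emitted += 1

    return emit


collection = [{"_id": i} for i in (3, 1, 5, 2, 4)]
checkpoint: Dict[str, Optional[int]] = {"last_id": None}
copied: List[Document] = []

try:
    # First run: the "connector" stops after two documents.
    copy_with_checkpoint(collection, checkpoint, flaky_sink(copied, fail_after=2))
except SimulatedRestart:
    pass

# Second run: resumes after the checkpoint instead of starting from scratch.
copy_with_checkpoint(collection, checkpoint, flaky_sink(copied, fail_after=10))
copied_ids = [doc["_id"] for doc in copied]
```

In this sketch each document is copied exactly once across both runs. A real implementation would also have to store the checkpoint durably and decide how to handle documents inserted or updated while the copy is in progress, which this example does not attempt.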
| Comments |
| Comment by Robert Walters [ 10/Oct/22 ] |
|
Added ticket as part of the Kafka Source Performance/Scale improvement |
| Comment by PM Bot [ 01/Jun/22 ] |
|
There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information. |
| Comment by Colin Smetz [ 18/May/22 ] |
|
I gave all the details in this forum discussion. Basically:
So while it would make our lives easier, it is not critical. We haven't had a case where it was impossible to complete the initial copy before the next restart, but ideally we would like to be sure that we have a solution should that case arise. |
| Comment by Robert Walters [ 16/May/22 ] |
|
colin.smetz@euranova.eu How often is your connector failing? Can you try to address that issue? This is a complex issue to solve and one that hasn't come up before. What is your use case? Are you looking to just replicate MongoDB data from source to sink? |