Type: Improvement
Resolution: Unresolved
Priority: Unknown
Fix Version/s: None
Affects Version/s: 1.6.1
Component/s: Source
Labels: None

Epic Link: Kafka Source Scalability

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Link: None
Goal Name(s): None

When the copy.existing option is set to true, the source connector first copies the entire MongoDB collection. However, if the connector restarts before the copy finishes, the copy starts again from scratch.
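
For context, a minimal sketch of a source connector definition that enables the initial copy, written as the JSON payload one would submit to Kafka Connect. The connector name, connection URI, database, and collection below are placeholders invented for illustration; only the copy.existing setting comes from this report.

```python
import json

# Sketch of a source connector definition with the initial copy enabled.
# The name, URI, database and collection are hypothetical placeholders.
source_connector = {
    "name": "mongo-source-example",  # hypothetical connector name
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://localhost:27017",  # placeholder URI
        "database": "exampledb",  # hypothetical database
        "collection": "large_collection",  # hypothetical collection
        # Copy the existing documents before streaming change events.
        "copy.existing": "true",
    },
}

# Serialize to the JSON body that would be sent to Kafka Connect.
payload = json.dumps(source_connector, indent=2)
print(payload)
```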

This is a problem for large collections that take a long time to ingest, because:

A restart is more likely to happen during the copy.
The impact is greater: many duplicates are written to Kafka, and the process takes longer to finish than expected.

It would be great to have a failure recovery mechanism that ensures the copy resumes from where it left off when the connector restarts.
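
One possible shape for such a mechanism, as a rough sketch only and not the connector's actual implementation: copy documents in ascending _id order and record the last copied _id as a checkpoint, so that a restarted copy skips everything at or below it. The function copy_existing, the offsets dictionary, its keys, and the fail_after parameter are all hypothetical names used for illustration; the collection and the Kafka topic are simulated with in-memory lists.

```python
from typing import Callable, Dict, List, Optional


def copy_existing(
    documents: List[dict],
    emit: Callable[[dict], None],
    offsets: Dict[str, object],
    fail_after: Optional[int] = None,
) -> None:
    """Copy documents in ascending _id order, resuming after the checkpoint.

    Hypothetical sketch: `offsets` stands in for the connector's stored
    offsets, and `fail_after` simulates a restart after N emitted documents.
    """
    last_id = offsets.get("copy_last_id")
    emitted = 0
    for doc in sorted(documents, key=lambda d: d["_id"]):
        # Skip documents that were already copied before the restart.
        if last_id is not None and doc["_id"] <= last_id:
            continue
        if fail_after is not None and emitted >= fail_after:
            raise RuntimeError("simulated connector restart")
        emit(doc)
        # Checkpoint the last copied _id so a restart can resume from here.
        offsets["copy_last_id"] = doc["_id"]
        emitted += 1
    offsets["copy_done"] = True


# Simulated collection, Kafka topic, and offset storage.
collection = [{"_id": i, "value": f"doc-{i}"} for i in range(10)]
topic: List[dict] = []
offsets: Dict[str, object] = {}

# First attempt is interrupted after 4 documents.
try:
    copy_existing(collection, topic.append, offsets, fail_after=4)
except RuntimeError:
    pass

# Second attempt resumes after the checkpoint instead of starting over.
copy_existing(collection, topic.append, offsets)
print(len(topic), offsets)
```

Against a real collection, the skip step would presumably become a query filter such as _id greater than the checkpoint with a sort on _id. Because emitting a record and saving the checkpoint are not atomic, a sketch like this would still allow a small number of duplicates around the restart point, rather than a full re-copy.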

Depends on: KAFKA-61 Improve Source Connector Performance / Scalability

Backlog

Assignee: Unassigned
Reporter: Colin Smetz
Votes: 1
Watchers: 3

Created: May 11 2022 07:54:25 AM UTC
Updated: Dec 12 2022 11:44:07 PM UTC
