[KAFKA-272] Allow connector to only do Copy Existing and no Change Streams Created: 16/Dec/21 Updated: 27/Oct/23 Resolved: 05/Jan/22 |
|
| Status: | Closed |
| Project: | Kafka Connector |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Unknown |
| Reporter: | Daniel Barreto | Assignee: | Robert Walters |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | external-user | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
We have to read a large amount of historic data (2+ TB), and it is taking too long because of CPU limitations: we can't get the connector to use all 4 cores. We are therefore thinking about running multiple instances of the connector, each importing a different portion of the historic data, with only one instance handling the change streams. However, we can't find a clean way to configure this, since it seems every instance would automatically switch to the change stream once it finishes its own Copy Existing portion. Thanks for any advice or help you can give us with this. |
| Comments |
| Comment by Robert Walters [ 05/Jan/22 ] |
|
Resolved by customer |
| Comment by Daniel Barreto [ 05/Jan/22 ] |
|
Yes, we figured it out. Thank you! |
| Comment by Robert Walters [ 04/Jan/22 ] |
|
daniel@haystack.tv You can configure multiple instances of the connector, each with its own pipeline, using the copy.existing filter so that each instance effectively takes a portion of the data (see https://docs.mongodb.com/kafka-connector/master/source-connector/usage-examples/copy-existing-data/#filter-data). For example, connector 1 could have a filter where state='ny', connector 2 where state='ma', and so on. I'm not sure if your data can be broken up this way, but this is one option. Would this work for you? |
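
The partitioning suggested above could be sketched roughly as follows. This is only an illustration, not the configuration the reporter ended up using (the ticket does not record it). The `state` field and the values `ny` and `ma` come from the comment. The connector names, connection URI, database, and collection are hypothetical placeholders. The property names `copy.existing` and `copy.existing.pipeline` are assumed to match the connector version in use; newer releases may name these settings differently, so check the linked documentation.

```python
import json


def build_copy_configs(states,
                       uri="mongodb://localhost:27017",  # hypothetical URI
                       database="mydb",                  # hypothetical database
                       collection="events"):             # hypothetical collection
    """Build one source-connector config per value of the `state` field.

    Each config carries its own copy-existing pipeline, so each connector
    instance copies only the documents that match its filter.
    """
    configs = []
    for state in states:
        # A $match stage restricts the copy to one partition of the data.
        pipeline = [{"$match": {"state": state}}]
        configs.append({
            "name": f"mongo-copy-{state}",  # hypothetical connector name
            "config": {
                "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
                "connection.uri": uri,
                "database": database,
                "collection": collection,
                "copy.existing": "true",
                # Kafka Connect expects the pipeline as a JSON string.
                "copy.existing.pipeline": json.dumps(pipeline),
            },
        })
    return configs


if __name__ == "__main__":
    # Print the configs that would be submitted, one per connector instance.
    for cfg in build_copy_configs(["ny", "ma"]):
        print(json.dumps(cfg, indent=2))
```

Each printed object has the shape typically posted to the Kafka Connect REST API when creating a connector. The sketch covers only how the copy could be split; it does not by itself stop an instance from opening a change stream after its copy completes.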
| Comment by Esha Bhargava [ 21/Dec/21 ] |
|
daniel@haystack.tv Thank you for reporting the issue! We'll look into it and get back to you soon. |