[KAFKA-214] Allow for partitioning based on documentId Created: 30/Mar/21  Updated: 27/Oct/23  Resolved: 12/May/21

Status: Closed
Project: Kafka Connector
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Brian Begy
Assignee: Ross Lawley
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
Related
related to KAFKA-135 Support collection partitioning when ... Backlog
Case:
Documentation Changes: Needed
Documentation Changes Summary:

Presumably, documentation would be needed to explain how to configure it.


 Description   

The problem:  

If your source is publishing to a topic with more than one partition, we cannot guarantee that sinks will consume a document's changes in the order in which they occurred. For example, the create could go to partition 0, the update to partition 1, and the delete to partition 2. With a slow sink on partition 0 and a fast sink on partition 2, we could get the delete before the create.

If we could assign the partition based on the documentId, we could ensure that a given document is processed in the right order.  
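To illustrate the request, here is a minimal Python sketch of key-based routing. It is a simulation and not connector code: the event tuples, the partition count, and the helper names are hypothetical, and `zlib.crc32` stands in for the hash used by Kafka's default partitioner. The point is only that a deterministic hash of the documentId sends every change for a document to the same partition, where order is preserved.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in for Kafka's default partitioner: any deterministic hash of the
    # record key, taken modulo the partition count, shows the same property.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, key_of):
    """Append each event to the partition chosen from its record key."""
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_for(key_of(event))].append(event)
    return dict(partitions)


def partitions_holding(doc_id, partitions):
    """Return the set of partitions that contain events for doc_id."""
    return {p for p, evs in partitions.items() if any(e[1] == doc_id for e in evs)}


# Hypothetical change events: (event id, document id, operation).
events = [
    ("evt-1", "doc-A", "create"),
    ("evt-2", "doc-B", "create"),
    ("evt-3", "doc-A", "update"),
    ("evt-4", "doc-A", "delete"),
    ("evt-5", "doc-B", "update"),
]

# Keyed by document id: all changes for one document share a partition.
by_document = route(events, key_of=lambda e: e[1])

# Keyed by a per-event id: nothing ties a document's changes to one partition.
by_event = route(events, key_of=lambda e: e[0])
```

With `by_document`, `partitions_holding("doc-A", by_document)` always contains exactly one partition, so a sink reading that partition sees create, update, delete in order. With `by_event` there is no such guarantee.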



 Comments   
Comment by Ross Lawley [ 12/May/21 ]

The functionality already exists (since 1.3.0); see the output.schema.key property for the MongoDB Kafka Source Connector.

Usage is also covered in this blog post under the section *Write Data to Specific Partitions*. Also see this configuration example.
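As a rough sketch of what such a configuration might look like, the Python snippet below builds a source connector config that keys each record on the changed document's `_id`. Only `output.schema.key` is named in this ticket. The companion `output.format.key` setting, the shape of the key schema, the `string` type for `_id`, and the connection, database, and collection values are assumptions for illustration and should be checked against the connector documentation.

```python
import json

# Assumed Avro-style key schema: take the key from the change event's
# documentKey._id rather than from the per-event _id.
key_schema = {
    "type": "record",
    "name": "keySchema",
    "fields": [
        {
            "name": "documentKey",
            "type": {
                "type": "record",
                "name": "documentKey",
                "fields": [{"name": "_id", "type": "string"}],
            },
        }
    ],
}

# Hypothetical connector definition; names and URIs are placeholders.
source_config = {
    "name": "mongo-source-by-document-id",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://localhost:27017",
        "database": "exampleDb",
        "collection": "exampleCollection",
        # Assumption: the key schema is only applied when the key format is "schema".
        "output.format.key": "schema",
        # The schema is passed as a JSON string, not as a nested object.
        "output.schema.key": json.dumps(key_schema),
    },
}

# JSON body that could be submitted to a Kafka Connect worker.
payload = json.dumps(source_config, indent=2)
```

Because every change to a given document would then carry the same record key, the producer's key-based partitioning places those changes in one partition, which is the ordering guarantee requested in the description.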

Comment by Ross Lawley [ 11/May/21 ]

Marked for the 1.6.0 release.
