[KAFKA-75] New id strategy that doesn't reuse configuration parameters. Created: 02/Dec/19  Updated: 28/Oct/23  Resolved: 26/Jun/20

Status: Closed
Project: Kafka Connector
Component/s: None
Affects Version/s: None
Fix Version/s: 1.2.0

Type: Improvement Priority: Major - P3
Reporter: Seth Payne Assignee: Ross Lawley
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documented
Related
is related to KAFKA-112 Add include and exclude lists for fie... Closed
Case:
Documentation Changes: Needed
Documentation Changes Summary:

`PartialValueStrategy` and `PartialKeyStrategy` now have namespaced configurations. Use:

`document.id.strategy.partial.value.projection.type`, `document.id.strategy.partial.value.projection.list`,
`document.id.strategy.partial.key.projection.type` and `document.id.strategy.partial.key.projection.list`

instead of `[key|value].projection.[type|list]`.

Remove the note stating that projection post processors are not compatible with `PartialValueStrategy`.


 Description   

We want to support configuring more than one processor in the post-processor chain.

Consider the following example:

{
  "topic.override.source.document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
  "topic.override.source.collection": "sink",
  "topic.override.source.value.projection.type": "whitelist",
  "topic.override.source.value.projection.list": "attuid",
  "topic.override.source.writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy",

  "topic.override.source.post.processor.chain": "com.mongodb.kafka.connect.sink.processor.WhitelistValueProjector",
  "topic.override.source.collection": "sink",
  "topic.override.source.value.projection.type": "whitelist",
  "topic.override.source.value.projection.list": "attuid, name, pc",
  "topic.override.source.batch.size": "100",
  "name": "mongo-sink"
}

Currently, this fails to update the MongoDB documents correctly. The current design allows only one processor to be configured, because it is not possible to configure settings separately for each of the processors.

This work is to implement a new id strategy that does not reuse configuration parameters.
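As a rough sketch of what this could look like, the example above might be rewritten with the namespaced keys listed in the documentation summary for this issue. It assumes that the `topic.override.source.` prefix applies to the namespaced keys in the same way as to the other settings, and that `whitelist` remains an accepted projection type; neither is confirmed here, so treat this as illustrative rather than a tested configuration.

```json
{
  "name": "mongo-sink",
  "topic.override.source.collection": "sink",
  "topic.override.source.document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
  "topic.override.source.document.id.strategy.partial.value.projection.type": "whitelist",
  "topic.override.source.document.id.strategy.partial.value.projection.list": "attuid",
  "topic.override.source.writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy",
  "topic.override.source.post.processor.chain": "com.mongodb.kafka.connect.sink.processor.WhitelistValueProjector",
  "topic.override.source.value.projection.type": "whitelist",
  "topic.override.source.value.projection.list": "attuid, name, pc",
  "topic.override.source.batch.size": "100"
}
```

Here the id strategy projects only `attuid`, while the post processor keeps `attuid`, `name` and `pc`, so the two projections no longer compete for the same `value.projection.*` parameters.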



 Comments   
Comment by Githook User [ 26/Jun/20 ]

Author: Ross Lawley <ross.lawley@gmail.com> (rozza)

Message: Add specific configurations for Partial[Key|Value]Strategies

Added `document.id.strategy.partial.key.` configurations and
`document.id.strategy.partial.value.` configurations.

So that users can both project fields as part of the Id Strategy
as well as project any fields they want to store.

KAFKA-75
Branch: master
https://github.com/mongodb/mongo-kafka/commit/a820953e2ba6bc272cc7f76b94b0b012437c2481

Comment by Ross Lawley [ 03/Dec/19 ]

seth.payne marking as an improvement as this is how the connector was originally designed to work.

There is nothing stopping custom processors from using their own parameters; it's just that you can't define the projection configuration twice, once for the PartialValueStrategy and again for the WhitelistValueProjector. I believe the current workaround would be to use SMTs (Single Message Transforms), which are native to Kafka Connect, to process the data first and project the required fields.
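A minimal sketch of that workaround, assuming Kafka Connect's built-in `ReplaceField$Value` transform is used for the field projection: the transform alias `projectFields` is a made-up name, and the `whitelist` option shown is the older spelling (later Kafka releases rename it to `include`). Note that a transform configured this way applies to every topic the connector consumes, not only to the `source` topic.

```json
{
  "transforms": "projectFields",
  "transforms.projectFields.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
  "transforms.projectFields.whitelist": "attuid,name,pc",
  "topic.override.source.collection": "sink",
  "topic.override.source.document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
  "topic.override.source.value.projection.type": "whitelist",
  "topic.override.source.value.projection.list": "attuid",
  "topic.override.source.writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy"
}
```

With this arrangement the transform trims each record to the fields to be stored before it reaches the sink, which leaves the `value.projection.*` settings free for the id strategy alone.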

Future work should use a name-based convention for configuring the id strategies. Keeping backwards compatibility may be a challenge.

Generated at Thu Feb 08 09:05:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.