[KAFKA-140] Add behavior.on.malformed.documents config to Sink Connector
Created: 11/Aug/20  Updated: 21/Sep/20  Resolved: 21/Sep/20

Status: Closed
Project: Kafka Connector
Component/s: Sink
Affects Version/s: 1.2.0
Fix Version/s: 1.3.0

Type: New Feature
Priority: Major - P3
Reporter: J vH
Assignee: Ross Lawley
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates KAFKA-105 Support errors.tolerance Closed
Epic Link: Error Handling

 Description   

When we send data from Kafka to MongoDB using the Sink Connector, MongoDB automatically detects the data type for each field. So far so good.

When the Kafka payload on the topic changes - meaning that an existing field suddenly has a new data type - the sink connector fails.  

This is not unexpected behavior in itself.

However, a single erroneous message, known in Kafka terminology as a poison pill, blocks the topic/sink connector combination. When the connector restarts, it resumes from the last committed offset and immediately fails again on the same message.
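
The restart loop described above can be illustrated with a toy model. This is a minimal sketch, not the connector's actual code: `write_to_sink`, `run_task`, the `price` field, and the in-memory `topic` list are all hypothetical stand-ins, and the type check merely simulates a write that the sink rejects.

```python
class MalformedDocument(Exception):
    """Raised by the toy sink when a record cannot be written."""


def write_to_sink(record):
    # Hypothetical stand-in for the sink write: reject a changed field type.
    if not isinstance(record["price"], (int, float)):
        raise MalformedDocument(f"unexpected type for price: {record['price']!r}")


def run_task(topic, committed_offset):
    """Process records starting at committed_offset.

    Returns (new_committed_offset, error). The offset only advances
    past records that were written successfully.
    """
    offset = committed_offset
    try:
        for record in topic[committed_offset:]:
            write_to_sink(record)
            offset += 1
    except MalformedDocument as exc:
        return offset, exc  # the task dies here
    return offset, None


topic = [{"price": 1.5}, {"price": "n/a"}, {"price": 2.0}]

# Three restarts: every run stops at the same record.
committed = 0
offsets_after_restart = []
for _ in range(3):
    committed, error = run_task(topic, committed)
    offsets_after_restart.append(committed)

# The manual workaround: move the offset past the poison pill by hand.
committed, error = run_task(topic, committed + 1)
```

In this model every restart stops at offset 1, and only shifting the offset manually lets the task reach the end of the topic.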

Changing the consumer-group offset is the only way to work around this. That is acceptable in a test environment, but not in production.

Extensive documentation is dedicated to configuring errors.tolerance (https://docs.mongodb.com/kafka-connector/master/kafka-sink-properties/). Unfortunately, the generic errors.tolerance connector settings only apply to the (de)serialization and SMT phases, and not to the actual connector code itself. In our case, and potentially many others, we are powerless when it comes to dealing with exceptions in the actual sink phase of the connector.
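
For reference, the generic Kafka Connect error-handling properties referred to above could be set roughly as follows when a sink connector is registered through the Connect REST API. This is a sketch only: the connector name, topic names, connection URI, database, and collection are hypothetical placeholders. As this ticket describes, in the affected version these settings do not cover failures raised during the sink write itself.

```python
import json

# Hypothetical names and URIs throughout; adjust to the actual deployment.
sink_config = {
    "name": "mongo-sink-example",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
        "topics": "orders",
        "connection.uri": "mongodb://localhost:27017",
        "database": "shop",
        "collection": "orders",
        # Generic Kafka Connect error handling for sink connectors:
        "errors.tolerance": "all",           # "none" (default) fails on the first error
        "errors.log.enable": "true",         # log each tolerated failure
        "errors.log.include.messages": "true",
        "errors.deadletterqueue.topic.name": "orders.dlq",
        "errors.deadletterqueue.context.headers.enable": "true",
    },
}

# JSON body that would be sent to the Connect REST API to create the connector.
payload = json.dumps(sink_config, indent=2)
print(payload)
```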

What would be highly desirable is a configuration property equivalent to the Elasticsearch sink connector's behavior.on.malformed.documents:

https://docs.confluent.io/current/connect/kafka-connect-elasticsearch/configuration_options.html
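
To make the request concrete, the following sketch shows the kind of semantics such a property could have, modelled on the ignore, warn, and fail values of the Elasticsearch connector's setting. It is a toy illustration under assumptions, not a proposed implementation: `put`, `write_document`, and the `price` field are hypothetical names, and the type check only simulates a document that the sink rejects.

```python
import logging

VALID_BEHAVIORS = ("ignore", "warn", "fail")


def write_document(record):
    # Hypothetical stand-in for the sink write: reject a changed field type.
    if not isinstance(record["price"], (int, float)):
        raise ValueError(f"malformed document: {record!r}")


def put(records, behavior="fail"):
    """Write records, handling rejected documents according to behavior.

    Returns (written, skipped). With "fail" the first rejected document
    re-raises and stops the batch; "warn" logs and skips; "ignore" skips silently.
    """
    if behavior not in VALID_BEHAVIORS:
        raise ValueError(f"unknown behavior: {behavior!r}")
    written, skipped = [], []
    for record in records:
        try:
            write_document(record)
            written.append(record)
        except ValueError:
            if behavior == "fail":
                raise
            if behavior == "warn":
                logging.warning("skipping malformed document: %r", record)
            skipped.append(record)
    return written, skipped


batch = [{"price": 1.5}, {"price": "n/a"}, {"price": 2.0}]
written, skipped = put(batch, behavior="warn")
```

With "warn" or "ignore", the poison pill is skipped and the remaining records are still written; with "fail", the behavior matches what this ticket describes.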

 

Are there any plans already for such a setting?



 Comments   
Comment by Ross Lawley [ 21/Sep/20 ]

Reusing the existing errors.tolerance setting will handle invalid records / poison pills.

Comment by Ross Lawley [ 11/Aug/20 ]

Hi jeffrey.vanhelden@thewarehouse.co.nz,

Thanks for the ticket. There is an epic for improving error handling, which I've linked. This is probably a duplicate of KAFKA-105, but I'll keep it open to ensure the poison pill scenario is covered.

Ross
