[KAFKA-140] Add behavior.on.malformed.documents config to Sink Connector Created: 11/Aug/20 Updated: 21/Sep/20 Resolved: 21/Sep/20 |
|
| Status: | Closed |
| Project: | Kafka Connector |
| Component/s: | Sink |
| Affects Version/s: | 1.2.0 |
| Fix Version/s: | 1.3.0 |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | J vH | Assignee: | Ross Lawley |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Epic Link: | Error Handling | ||||||||
| Description |
|
When we send data from Kafka to MongoDB using the Sink Connector, MongoDB automatically detects the data type for each field. So far so good. When the Kafka payload on the topic changes, meaning that an existing field suddenly has a new data type, the sink connector fails. This is not in any way unexpected behavior.
However, an erroneous message, in Kafka terminology referred to as a poison pill, ruins the topic/sink connector combination. After a connector restart, the connector resumes from the last committed offset and immediately fails again on the same message. Changing the consumer-group offset is the only way to work around this. That is acceptable in a testing environment, but unacceptable in production.
Extensive documentation is dedicated to configuring errors.tolerance (https://docs.mongodb.com/kafka-connector/master/kafka-sink-properties/). Unfortunately, the generic errors.tolerance connector settings only apply to the (de)serialization and SMT phases, and not to the actual connector code itself. In our case, and potentially many others, we are powerless when it comes to dealing with exceptions in the actual sink phase of the connector.
What would be highly desirable is a configuration property equivalent to the Elasticsearch sink connector's behavior.on.malformed.documents: https://docs.confluent.io/current/connect/kafka-connect-elasticsearch/configuration_options.html
Are there any plans already for such a setting? |
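To make the request concrete, here is a minimal Python sketch of the semantics being asked for in the sink phase. It is not the connector's code: the `put` function, the `write_strict` writer, the `MalformedDocumentError` exception, and the `price` field are all hypothetical names chosen for illustration. The mode names `fail`, `warn`, and `ignore` are assumed to mirror the values of the Elasticsearch option.

```python
import logging

log = logging.getLogger("sink-sketch")


class MalformedDocumentError(Exception):
    """Raised by the hypothetical writer when a record cannot be stored."""


def write_strict(record):
    # Hypothetical writer: stands in for a sink that rejects a record
    # whose "price" field no longer has the expected numeric type.
    if not isinstance(record.get("price"), (int, float)):
        raise MalformedDocumentError(f"unexpected type for 'price': {record!r}")


def put(records, write, behavior="fail"):
    """Write (offset, record) pairs, handling bad records per `behavior`.

    Returns (written_offsets, skipped_offsets). With behavior="fail" the
    first bad record aborts the batch, which is the poison-pill situation
    described above.
    """
    if behavior not in ("fail", "warn", "ignore"):
        raise ValueError(f"unknown behavior: {behavior}")
    written, skipped = [], []
    for offset, record in records:
        try:
            write(record)
            written.append(offset)
        except MalformedDocumentError as exc:
            if behavior == "fail":
                raise
            if behavior == "warn":
                log.warning("skipping offset %d: %s", offset, exc)
            skipped.append(offset)
    return written, skipped
```

With `behavior="ignore"` or `"warn"`, a batch containing one bad record still makes progress, so the offset can advance past the poison pill instead of failing on every restart.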
| Comments |
| Comment by Ross Lawley [ 21/Sep/20 ] |
|
Reusing the existing errors.tolerance setting will handle invalid records / poison pills |
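As an illustration of what reusing errors.tolerance could look like, the Python sketch below builds a sink connector configuration of the kind submitted to the Kafka Connect REST API. The `errors.*` keys are standard Kafka Connect framework properties. The helper name `sink_config`, the topic names, the connection URI, and the database and collection names are placeholders, not values from this ticket. Whether records rejected during the write itself are also routed to the dead letter queue depends on the connector version and is not confirmed here.

```python
import json


def sink_config(topic, dlq_topic):
    """Return a sink connector config that tolerates bad records.

    All concrete values (URI, database, collection) are placeholders.
    """
    return {
        "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
        "topics": topic,
        "connection.uri": "mongodb://localhost:27017",  # placeholder
        "database": "example_db",                       # placeholder
        "collection": "example_collection",             # placeholder
        # Keep running when a record fails instead of stopping the task.
        "errors.tolerance": "all",
        # Log each failure, including the failed record's contents.
        "errors.log.enable": "true",
        "errors.log.include.messages": "true",
        # Send failed records to a dead letter queue topic with context headers.
        "errors.deadletterqueue.topic.name": dlq_topic,
        "errors.deadletterqueue.context.headers.enable": "true",
    }


if __name__ == "__main__":
    print(json.dumps(sink_config("orders", "orders.dlq"), indent=2))
```

The default value of `errors.tolerance` is `none`, which is why a single poison pill stops the task unless the setting is changed.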
| Comment by Ross Lawley [ 11/Aug/20 ] |
|
Hi jeffrey.vanhelden@thewarehouse.co.nz,
Thanks for the ticket. There is an epic for improving error handling, which I've linked. This is probably a duplicate of
Ross