[KAFKA-243] Log Associated Connector Name with "Resume Token Not Found" Message Created: 09/Aug/21  Updated: 27/Oct/23  Resolved: 10/Aug/21

Status: Closed
Project: Kafka Connector
Component/s: Source
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Unknown
Reporter: Diego Rodriguez (Inactive) Assignee: Ross Lawley
Resolution: Works as Designed Votes: 0
Labels: internal-user
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Case:
Documentation Changes: Not Needed

 Description   

Hi Team!

Although the message for the "Resume Token Not Found" error has been greatly improved thanks to KAFKA-91, it is still not easy to identify which connector throws the exception:

[2021-08-04 10:00:07,176] WARN Failed to resume change stream: Resume of change stream was not possible, as the resume point may no longer be
 in the oplog. 286
 
=====================================================================================
If the resume token is no longer available then there is the potential for data loss.
Saved resume tokens are managed by Kafka and stored with the offset data.
 
To restart the change stream with no resume token either: 
  * Create a new partition name using the `offset.partition.name` configuration.
  * Set `errors.tolerance=all` and ignore the erroring resume token. 
  * Manually remove the old offset from its configured storage.
 
Resetting the offset will allow for the connector to be resume from the latest resume
token. Using `copy.existing=true` ensures that all data will be outputted by the
connector but it will duplicate existing data.
=====================================================================================
 (com.mongodb.kafka.connect.source.MongoSourceTask:422)
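Of the remediation options listed in the quoted message, creating a new offset partition name is usually the least invasive. A minimal sketch of a source connector configuration that applies it (the connector name, connection string, database, collection, and partition name are hypothetical placeholders, not values from this ticket):

```json
{
  "name": "mongo-source-orders",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb://mongo1:27017",
    "database": "orders",
    "collection": "lineitems",
    "offset.partition.name": "mongo-source-orders-v2"
  }
}
```

Because the new partition name has no stored offset, the connector starts a fresh change stream; adding `copy.existing=true` replays existing data at the cost of duplicates, as the log message notes.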

Understanding that one Source Connector opens a single Change Stream to the source MongoDB Cluster, would it be possible to include the Source Connector name within the error message? This would make it easier to identify the affected Connector, especially in distributed environments where several other Connectors might be running.

More generally, logging the associated Source (or Sink) Connector name would be appreciated for any other errors where identifying the associated Connector might otherwise be complicated and painful.

Thanks
Diego



 Comments   
Comment by Ross Lawley [ 10/Aug/21 ]

Hi diego.rodriguez,

I believe this functionality is already available via Kafka's logging configuration:

With Apache Kafka 2.3, Mapped Diagnostic Context (MDC) logging is available, giving much more context in the logs:

INFO [sink-elastic-orders-00|task-0] Using multi thread/connection supporting pooling connection manager (io.searchbox.client.JestClientFactory:223)
INFO [sink-elastic-orders-00|task-0] Using default GSON instance (io.searchbox.client.JestClientFactory:69)
INFO [sink-elastic-orders-00|task-0] Node Discovery disabled... (io.searchbox.client.JestClientFactory:86)
INFO [sink-elastic-orders-00|task-0] Idle connection reaping disabled... (io.searchbox.client.JestClientFactory:98)

This change in logging format is disabled by default to maintain backward compatibility. To enable this improved logging, you need to edit etc/kafka/connect-log4j.properties and set the log4j.appender.stdout.layout.ConversionPattern as shown here:

log4j.appender.stdout.layout.ConversionPattern=[%d] %p %X{connector.context}%m (%c:%L)%n

Support for this has also been added to the Kafka Connect Docker images through the environment variable CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN.
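For the Docker images mentioned above, the same pattern can be supplied through the environment. A minimal docker-compose sketch (the image tag and service name are illustrative, not prescribed by this ticket):

```yaml
services:
  connect:
    image: confluentinc/cp-kafka-connect:5.3.0
    environment:
      CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
```

With this in place, every log line is prefixed with the connector and task context, e.g. `[sink-elastic-orders-00|task-0]`, which answers the identification problem described in this ticket.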

For more details, see KIP-449.

Source: Kafka Connect Improvements in Apache Kafka 2.3

As such, I'm closing this ticket as "Works as Designed".

If you feel our documentation would benefit from including this configuration, then please open a DOCSP ticket.

Ross

Generated at Thu Feb 08 09:05:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.