[KAFKA-252] Add regular expressions in topic mapping Created: 24/Sep/21  Updated: 09/Aug/23  Resolved: 27/Jul/23

Status: Closed
Project: Kafka Connector
Component/s: Configuration
Affects Version/s: None
Fix Version/s: 1.11.0

Type: New Feature Priority: Unknown
Reporter: Jean-Francois Guena Assignee: Valentin Kavalenka
Resolution: Done Votes: 0
Labels: external-user
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
Duplicate
is duplicated by KAFKA-280 Support database.wildcard in DefaultT... Closed
Quarter: FY24Q2
Case:
Documentation Changes: Needed

 Description   

This is about the pull request I submitted a few days ago here: https://github.com/mongodb/mongo-kafka/pull/82

At the time, I was not able to access this Jira project to submit a new feature before submitting the PR... Sorry.

The main idea is to allow regular expressions in the 'topic.namespace.map' property so that, for example, every database whose name starts with "myDB-" is mapped to a certain topic (this is a very simple example, of course).

Hope the PR will be reviewed soon...

Regards



 Comments   
Comment by Robert Walters [ 26/Jul/23 ]

SGTM. It appears that using / will still achieve the objective stated in the PR.

Comment by Valentin Kavalenka [ 17/Jul/23 ]

Keys in the topic.namespace.map (which is really a BSON document represented as extended JSON text that must be parseable by Document.parse) must start with a MongoDB database name to be used; otherwise, their values are ignored by the connector. This follows from the current behavior of DefaultTopicMapper. According to https://www.mongodb.com/docs/manual/reference/limits/#naming-restrictions, a database name cannot contain / (forward slash) on any of the currently supported operating systems. These two facts allow us to assume that keys in the topic.namespace.map do not currently contain / (if they do, they are not names of a database and, therefore, are not used).

Therefore, instead of introducing a new TopicMapper implementation, as suggested in the original PR linked to the ticket, we may change the behavior of DefaultTopicMapper in a backward-compatible way: if a key starts with / (note that we don't require it to end with /, as was suggested in the original PR, because I see no reason to), we consider that key to be a regular expression [1] with the syntax specified by https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html; otherwise, we treat it as we currently do. The benefit of this approach is that users will be able to mix the current and regexp-based configuration without having to switch to a different TopicMapper.
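To make the proposed rule concrete, here is a minimal sketch of how a key that starts with / could be told apart from a plain key and then matched. It is not the connector's code: the class RegexKeySketch and the methods isRegexKey and topicFor are hypothetical names, and the sketch assumes that a regexp is matched against the whole "db.collection" namespace and that exact keys take precedence over regexp keys. Neither assumption is settled in this ticket, and the existing database-only and wildcard handling of DefaultTopicMapper is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch; this is NOT the connector's DefaultTopicMapper code.
class RegexKeySketch {
    // Proposed rule: a key is a regular expression if it starts with '/'.
    // No trailing '/' is required.
    static boolean isRegexKey(String key) {
        return key.startsWith("/");
    }

    // Assumption: exact keys win; regexp keys are then tried in declaration
    // order and must match the full "db.collection" namespace string.
    static Optional<String> topicFor(String namespace, Map<String, String> map) {
        String exact = map.get(namespace);
        if (exact != null) {
            return Optional.of(exact);
        }
        for (Map.Entry<String, String> e : map.entrySet()) {
            String key = e.getKey();
            if (isRegexKey(key)
                    && Pattern.compile(key.substring(1)).matcher(namespace).matches()) {
                return Optional.of(e.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Mixes a plain key and a regexp key, as the proposal allows.
        Map<String, String> map = new LinkedHashMap<>();
        map.put("inventory.orders", "ordersTopic");
        map.put("/myDB-.*", "myDbTopic");

        System.out.println(topicFor("inventory.orders", map));
        System.out.println(topicFor("myDB-eu.users", map));
        System.out.println(topicFor("other.coll", map));
    }
}
```

Because a database name cannot contain /, a plain key can never start with the marker, which is what lets both kinds of key share one map.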

As for the rest of the functionality, I think we should adopt what was proposed in the original PR.

robert.walters@mongodb.com, ross@mongodb.com (Ross is actually on vacation, so for now the question is mostly for Robert) does that sound good?


[1] If there are configurations in the wild whose topic.namespace.map has a key that starts with /, that key is currently guaranteed to be ignored by the connector, as described above. However, if we start using / as a regexp marker, the connector may suddenly behave differently with those configurations. Given everything explained above, I think it is acceptable to assume that such keys do not exist in the wild, and to warn users in the "What's New" notes about what may happen if they do (see https://jira.mongodb.org/browse/DOCSP-31502).

Comment by Robert Walters [ 18/Oct/21 ]

Thank you for your submission. We will consider this for 1.8.0.

Generated at Thu Feb 08 09:05:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.