[SERVER-48026] Stop unconditionally throwing an error when a text search language is unsupported
Created: 07/May/20  Updated: 27/Dec/23

Status: Backlog
Project: Core Server
Component/s: Text Search
Affects Version/s: 4.0.3
Fix Version/s: None

Type: Improvement
Priority: Minor - P4
Reporter: Eric Guan
Assignee: Backlog - Query Integration
Resolution: Unresolved
Votes: 0
Labels: qexec-team, qi-text-search
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Query Integration
Sprint: Query 2020-06-01
Participants:
Case:

 Description   

Context: https://stackoverflow.com/questions/61418548

I receive language codes from twitch.tv and store them in MongoDB. I'd like to use text search on supported languages. For unsupported languages, I'd like MongoDB to ignore the code instead of throwing an error. This way, I can use one field to store the language code.

Currently, it seems I have to detect whether MongoDB supports a language and, if it does not, store the code in another field. So every document now needs two fields: `language` for supported languages and `language2` for unsupported ones. This is an ugly hack, and I'd like a better solution.
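
For illustration, the two-field workaround could look roughly like the sketch below. It is not MongoDB's own logic: `split_language` is a hypothetical helper, and the `SUPPORTED` set is an assumption based on the text search languages listed in the MongoDB manual, so it should be checked against the server version in use. Only the field names `language` and `language2` come from the description above.

```python
# Assumption: language names and ISO 639-1 codes accepted by text indexes,
# as listed in the MongoDB manual. Verify against your server version.
SUPPORTED = {
    "da", "danish", "nl", "dutch", "en", "english", "fi", "finnish",
    "fr", "french", "de", "german", "hu", "hungarian", "it", "italian",
    "nb", "norwegian", "pt", "portuguese", "ro", "romanian",
    "ru", "russian", "es", "spanish", "sv", "swedish", "tr", "turkish",
    "none",
}


def split_language(doc, code):
    """Return a copy of doc with code stored in `language` or `language2`.

    Hypothetical helper: supported codes go into `language`, which the
    text index reads; everything else goes into `language2`, which the
    index ignores, so the write is not rejected.
    """
    out = dict(doc)
    if code in SUPPORTED:
        out["language"] = code
    else:
        out["language2"] = code
    return out


print(split_language({"title": "stream"}, "en"))
print(split_language({"title": "stream"}, "zh"))
```

The cost is exactly the one described: every reader of the collection has to look in two places to recover the original language code.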



 Comments   
Comment by Eric Guan [ 03/Jun/20 ]

> Is your request that the system automatically falls back to the default_language associated with the text index if the language code specified in a document's language_override field is not recognized?

If the language code specified in a document's language_override field is not recognized, I would like to still be able to store the document without error, without having to use another field for the unrecognized language code.

That is my only request. I've not considered what should happen after. If falling back to the default_language is the sane choice here, then I support it. 

> we would likely do so under a new opt-in flag specified when the index is created.

I agree. If implemented, it should be opt-in.

> Are you aware of MongoDB Atlas Search?

I was not aware of it, but Atlas is way above my needs right now.

Thanks for the consideration.


Comment by David Storch [ 03/Jun/20 ]

Hi guanzo91@gmail.com,

Thanks for the feature request! One question to make sure we understand correctly: Is your request that the system automatically falls back to the default_language associated with the text index if the language code specified in a document's language_override field is not recognized? I ask because I don't think it would be acceptable from our point of view to simply not index documents containing an unsupported language code. Text indexes can be made partial explicitly, but I don't think we want them to become implicitly partial, because $text search result sets would then silently omit the skipped documents.
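
To make the question concrete, the fallback semantics being asked about could be modelled roughly as below. This is only an illustration of the proposed behaviour, not what the server does today, since an unrecognized value is currently rejected. `effective_language` is a hypothetical helper, and `SUPPORTED` is an assumed list taken from the MongoDB manual. The defaults `"english"` and `"language"` mirror the documented defaults for `default_language` and `language_override`.

```python
# Assumption: language names and ISO 639-1 codes accepted by text indexes,
# as listed in the MongoDB manual. Verify against your server version.
SUPPORTED = {
    "da", "danish", "nl", "dutch", "en", "english", "fi", "finnish",
    "fr", "french", "de", "german", "hu", "hungarian", "it", "italian",
    "nb", "norwegian", "pt", "portuguese", "ro", "romanian",
    "ru", "russian", "es", "spanish", "sv", "swedish", "tr", "turkish",
    "none",
}


def effective_language(doc, default_language="english",
                       override_field="language"):
    """Model of the proposed fallback (hypothetical, not server code).

    If the override field holds a recognized language, use it.
    Otherwise fall back to the index's default language instead of
    raising an error.
    """
    value = doc.get(override_field)
    if isinstance(value, str) and value in SUPPORTED:
        return value
    return default_language


print(effective_language({"language": "es"}))
print(effective_language({"language": "zh"}))
```

Under this model every document is still indexed, so the index stays complete; the open question is only which stemmer and stop-word list apply to documents with an unrecognized code.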

Another thought is that if we were to implement this feature, we would likely do so under a new opt-in flag specified when the index is created. That would be necessary to avoid breaking applications which expect the system to enforce the invariant that documents containing invalid values in the language_override field are rejected.

Now that I've reviewed this request a bit more carefully, I'm going to put it back in the Server Query Team's queue for triage.

One final note: Are you aware of MongoDB Atlas Search? A beta version of this feature is available for MongoDB 4.2 on Atlas. It's one of the places where our engineering team is investing resources around text search use cases right now.

Thanks,
Dave

Comment by Carl Champain (Inactive) [ 11/May/20 ]

Hi guanzo91@gmail.com,

Thanks for the report. I'm passing this ticket along to the appropriate team for additional investigation. Updates will be posted as they happen.

Kind regards,
Carl
