[SERVER-48026] Stop unconditionally throwing an error when a text search language is unsupported Created: 07/May/20 Updated: 27/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Text Search |
| Affects Version/s: | 4.0.3 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Eric Guan | Assignee: | Backlog - Query Integration |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | qexec-team, qi-text-search | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Query Integration |
| Sprint: | Query 2020-06-01 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Context: https://stackoverflow.com/questions/61418548

I receive language codes from twitch.tv and store them in MongoDB. I'd like to use text search on the supported languages. For unsupported languages, I'd like MongoDB to ignore the language code instead of throwing an error, so that I can use a single field to store it.

Currently, it seems I have to detect whether MongoDB supports a language and, if it does not, store the code in another field. Every document therefore needs two fields: `language` for supported languages and `language2` for unsupported ones. This is an ugly hack and I'd like a better solution. |
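For readers following along, the two-field workaround described above might look roughly like the sketch below. It is only an illustration: the helper name `split_language_fields` is hypothetical, and the set of supported ISO 639-1 codes is an assumption that should be checked against the text search language list for the server version in use.

```python
from typing import Any, Dict, Optional

# Assumption: ISO 639-1 codes accepted by MongoDB text indexes.
# Verify against the documentation for your server version.
SUPPORTED_TEXT_LANGUAGES = frozenset({
    "da", "nl", "en", "fi", "fr", "de", "hu", "it",
    "nb", "pt", "ro", "ru", "es", "sv", "tr",
})


def split_language_fields(doc: Dict[str, Any], code: Optional[str]) -> Dict[str, Any]:
    """Return a copy of doc with the language code stored in the right field.

    Hypothetical helper: supported codes go into `language` (the field the
    text index reads), all other codes go into `language2` so that the
    insert is not rejected.
    """
    out = dict(doc)
    if code is None:
        return out
    normalized = code.strip().lower()
    if normalized in SUPPORTED_TEXT_LANGUAGES:
        out["language"] = normalized
    else:
        out["language2"] = normalized
    return out
```

The returned dictionary would then be passed to the driver's insert call. The cost the reporter objects to is visible here: every reader of the collection has to know about both fields.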
| Comments |
| Comment by Eric Guan [ 03/Jun/20 ] |
|
> Is your request that the system automatically falls back to the default_language associated with the text index if the language code specified in a document's language_override field is not recognized?

If the language code specified in a document's language_override field is not recognized, I would like to still be able to store the document without error, without having to use another field for the unrecognized language code. That is my only request. I've not considered what should happen after. If falling back to the default_language is the sane choice here, then I support it.

> we would likely do so under a new opt-in flag specified when the index is created.

I agree, if implemented, it should be opt in.

> Are you aware of MongoDB Atlas Search?

Was not aware, but Atlas is way above my needs right now.

Thanks for the consideration.
|
| Comment by David Storch [ 03/Jun/20 ] |
|
Thanks for the feature request! One question to make sure we understand correctly: Is your request that the system automatically falls back to the default_language associated with the text index if the language code specified in a document's language_override field is not recognized?

I ask because I don't think it would be acceptable from our point of view to simply not index documents containing an unsupported language code. Although text indexes can also be partial, I don't think we want text indexes to be implicitly partial, so that we can ensure that $text search result sets are not partial.

Another thought is that if we were to implement this feature, we would likely do so under a new opt-in flag specified when the index is created. That would be necessary to avoid breaking applications which expect the system to enforce the invariant that documents containing invalid values in the language_override field are rejected.

Now that I've reviewed this request a bit more carefully, I'm going to put it back in the Server Query Team's queue for triage.

One final note: Are you aware of MongoDB Atlas Search? A beta version of this feature is available for MongoDB 4.2 on Atlas. It's one of the places where our engineering team is investing resources around text search use cases right now.

Thanks, |
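The fallback semantics discussed in this comment can be modeled in a few lines. The sketch below is not server code and does not describe an existing MongoDB option: the `fallback_on_unsupported` flag, the function name, and the exception class are hypothetical, and the set of language names is an assumption. Only `default_language` and `language_override` correspond to the index options named in the ticket.

```python
from typing import Any, Dict, FrozenSet

# Assumption: long-form language names accepted by text indexes, plus "none".
SUPPORTED_LANGUAGE_NAMES: FrozenSet[str] = frozenset({
    "danish", "dutch", "english", "finnish", "french", "german",
    "hungarian", "italian", "norwegian", "portuguese", "romanian",
    "russian", "spanish", "swedish", "turkish", "none",
})


class UnsupportedLanguageError(ValueError):
    """Hypothetical stand-in for the error raised on an unrecognized language."""


def resolve_index_language(
    doc: Dict[str, Any],
    *,
    default_language: str = "english",
    language_override: str = "language",
    supported: FrozenSet[str] = SUPPORTED_LANGUAGE_NAMES,
    fallback_on_unsupported: bool = False,  # hypothetical opt-in flag
) -> str:
    """Pick the language used to index doc under the proposed semantics."""
    value = doc.get(language_override)
    if value is None:
        # No override present: the index default applies.
        return default_language
    if isinstance(value, str) and value in supported:
        return value
    if fallback_on_unsupported:
        # Proposed behavior: index the document anyway with the default.
        return default_language
    # Current behavior as described in the ticket: the document is rejected.
    raise UnsupportedLanguageError(f"unsupported language: {value!r}")
```

With the flag off, the model keeps the invariant that invalid override values are rejected, which is why the flag would need to be opt-in. With it on, every document is still indexed, so `$text` result sets stay complete.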
| Comment by Carl Champain (Inactive) [ 11/May/20 ] |
|
Thanks for the report. I'm passing this ticket along to the appropriate team for additional investigation. Updates will be posted as they happen. Kind regards, |