[SERVER-30253] diacriticSensitive question
Created: 21/Jul/17  Updated: 10/Aug/17  Resolved: 10/Aug/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.4.5
Fix Version/s: None

Type: Question
Priority: Major - P3
Reporter: Juan Antonio Roy Couto
Assignee: Kyle Suarez
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-29918 stemming behavior for diacritics caus... Closed
Sprint: Query 2017-07-31, Query 2017-08-21
Participants:

 Description   

Hi.
This is my text index:

db.test.createIndex({ 'description' : 'text' }, { default_language : 'spanish' })

This is my document:

db.test.find({ _id : 1 })
{ "_id" : 1, "description" : "Obtención de financiación" }

These are my queries:

> db.test.find({ $text : { $search : 'Obtencion' } });
{ "_id" : 1, "description" : "Obtención de financiación" }
> db.test.find({ $text : { $search : 'financiacion' } });
>

Why does my last command not return the same document?
Why does "diacriticSensitive : false" not work when the searched word (financiacion) is not at the very beginning of the field?
Is this an issue, or am I misunderstanding something?
Thank you very much!



 Comments   
Comment by Juan Antonio Roy Couto [ 10/Aug/17 ]

Thank you @Kyle Suarez!

Comment by Kyle Suarez [ 10/Aug/17 ]

Hello juanroy,

After doing some investigation, I believe that the root cause of this issue is the same as SERVER-29918. Diacritics are not stripped from words before stemming, and there is a querying disparity based on how words get stemmed. Further in-depth investigation will be done in the linked ticket.

I'm closing this issue as a duplicate of SERVER-29918. Please watch that ticket for updates.

Regards,
Kyle
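
The disparity described in the comment above can be illustrated with a small sketch. This is a toy model, not MongoDB's actual code: the function names (stripDiacritics, toyStem, stemThenStrip, stripThenStem) are hypothetical, and toyStem is an assumed one-rule stemmer that only recognizes the accented Spanish suffix "ación". It is meant only to show why stemming before stripping diacritics can make an accented word and its unaccented spelling produce different index terms.

```javascript
// Toy model only: NOT MongoDB's implementation. All names are hypothetical.

// Remove combining marks after canonical decomposition (e.g. "ó" -> "o").
function stripDiacritics(s) {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Assumed one-rule stemmer: it only knows the accented suffix "ación".
function toyStem(word) {
  const suffix = 'aci\u00f3n'; // "ación", written with an escape for safety
  const w = word.normalize('NFC').toLowerCase();
  return w.endsWith(suffix) ? w.slice(0, -suffix.length) : w;
}

// Order described in the comment: stem first, strip diacritics afterwards.
function stemThenStrip(word) {
  return stripDiacritics(toyStem(word));
}

// Alternative order: strip diacritics first, then stem.
function stripThenStem(word) {
  return toyStem(stripDiacritics(word));
}

const pairs = [
  ['Obtenci\u00f3n', 'Obtencion'],
  ['financiaci\u00f3n', 'financiacion'],
];
for (const [indexed, searched] of pairs) {
  console.log(
    indexed, searched,
    'stem-then-strip match:', stemThenStrip(indexed) === stemThenStrip(searched),
    'strip-then-stem match:', stripThenStem(indexed) === stripThenStem(searched)
  );
}
```

In this toy model the "Obtención"/"Obtencion" pair agrees under either order because the stemming rule never fires, while "financiación" is reduced to "financi" and the unaccented "financiacion" is left whole, so the two terms no longer match. The real stemmer has many more rules; see SERVER-29918 for the actual investigation.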
