<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:24:48 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-30726] inconsistent treatment of stopwords with language=&quot;none&quot;</title>
                <link>https://jira.mongodb.org/browse/SERVER-30726</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;I am using MongoDB 3.2.8 on macOS. Here is a bug I&apos;m facing when using language=&quot;none&quot;. I expected that setting language=&quot;none&quot; would make MongoDB stop ignoring stopwords.&lt;/p&gt;

&lt;p&gt;I have created a collection &apos;resources&apos; with a text index on a field called &apos;title&apos;, using the default language (English).&lt;/p&gt;

&lt;p&gt;First, I created a document with title = &apos;What are you?&apos;.&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;When I search with db.getCollection(&apos;resources&apos;).find({$text: {$search: &apos;what are you?&apos;}}), I get no results, as expected, because all the words are stopwords.&lt;/li&gt;
	&lt;li&gt;When I search with&lt;br/&gt;
db.getCollection(&apos;resources&apos;).find({$text: {$search: &apos;what are you?&apos;, $language: &quot;none&quot;}}), I still get no results. Ideally, I&apos;d like this to return the document.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;Next, I modified the document to have title = &apos;Whats are you?&apos; (notice the typo &apos;Whats&apos;).&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;When I search with&lt;br/&gt;
db.getCollection(&apos;resources&apos;).find({$text: {$search: &apos;whats are you?&apos;, $language: &quot;none&quot;}}), I do not get this document, even though there is an exact match and &apos;whats&apos; is presumably not a stopword!&lt;/li&gt;
	&lt;li&gt;When I search with&lt;br/&gt;
db.getCollection(&apos;resources&apos;).find({$text: {$search: &apos;what are you?&apos;, $language: &quot;none&quot;}}), I get this document in return!&lt;/li&gt;
	&lt;li&gt;Language=&quot;en&quot; works as expected: when I search with db.getCollection(&apos;resources&apos;).find({$text: {$search: &apos;whats are you?&apos;}}), I get the result, and when I search with db.getCollection(&apos;resources&apos;).find({$text: {$search: &apos;what are you?&apos;}}), I don&apos;t.&lt;/li&gt;
&lt;/ol&gt;
</description>
                <environment></environment>
        <key id="417740">SERVER-30726</key>
            <summary>inconsistent treatment of stopwords with language=&quot;none&quot;</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="13202">Works as Designed</resolution>
                                        <assignee username="kyle.suarez@mongodb.com">Kyle Suarez</assignee>
                                    <reporter username="rajhans">Rajhans Samdani</reporter>
                        <labels>
                    </labels>
                <created>Thu, 17 Aug 2017 23:07:09 +0000</created>
                <updated>Fri, 27 Oct 2023 13:54:18 +0000</updated>
                            <resolved>Fri, 1 Dec 2017 16:13:22 +0000</resolved>
                                                                    <component>Text Search</component>
                                        <votes>0</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="1705791" author="rajhans" created="Mon, 23 Oct 2017 04:03:46 +0000"  >&lt;p&gt;Makes sense. Thanks!&lt;br/&gt;
Feel free to close this issue.&lt;/p&gt;</comment>
                            <comment id="1678157" author="thomas.schubert" created="Wed, 20 Sep 2017 21:23:08 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=rajhans&quot; class=&quot;user-hover&quot; rel=&quot;rajhans&quot;&gt;rajhans&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Sorry for the delay in getting back to you. The issue you&apos;re observing is the result of how text indexes are currently stored and queried.&lt;/p&gt;

&lt;p&gt;When text indexes are created, mongod will stem each word that isn&apos;t a stop word according to the language rules. These stemmed words are the tokens used as the index keys for all subsequent queries and point to their corresponding documents.&lt;/p&gt;

&lt;p&gt;When MongoDB queries a text index, it first removes the stop words and then stems the remaining words in the phrase according to the language rules that were passed in (or falls back to the default of English).&lt;/p&gt;

&lt;p&gt;If the stop word/stemming rules of the text index differ from the stop word/stemming rules of the query (e.g. the query uses a language other than the one the index was built with), it is possible that MongoDB will not find a match, since the roots from the stemmed query do not match any tokens in the index.&lt;/p&gt;

&lt;p&gt;In your first example, &quot;what&quot; is an English stop word, so no token &quot;what&quot; is added to the index when the document is inserted. As a consequence, no subsequent query with the root &quot;what&quot; will find the document containing &quot;what&quot;.&lt;/p&gt;

&lt;p&gt;In your second example, &quot;whats&quot; is stemmed to &quot;what&quot; according to English stemming rules, and mongod updates its text index to include the token &quot;what&quot; pointing to the newly inserted document. When mongod does not stem &quot;whats&quot; (per the language rules of &quot;none&quot;), it queries on &quot;whats&quot; and cannot find any matching tokens/documents. However, if the query is stemmed to &quot;what&quot;, or already contains &quot;what&quot;, then the document is returned as expected.&lt;/p&gt;
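
&lt;p&gt;To make the two examples above concrete, here is a toy sketch in Python. It is not MongoDB&apos;s actual implementation: the three-word stop word list, the &quot;strip a trailing s&quot; stemmer, and the function names are all simplifying assumptions chosen only to mirror the behavior described in this ticket.&lt;/p&gt;

```python
# Toy model of a text index; NOT MongoDB's real implementation.
# The stop word list and the "stemmer" below are simplified assumptions.
ENGLISH_STOP_WORDS = {"what", "are", "you"}


def tokenize(text, language):
    """Return the index/query tokens for text under the given language rules."""
    words = [w.strip("?.,!").lower() for w in text.split()]
    words = [w for w in words if w]
    if language == "none":
        # No stop word removal and no stemming.
        return words
    # "english": drop stop words first, then stem what remains.
    kept = [w for w in words if w not in ENGLISH_STOP_WORDS]
    return [w[:-1] if w.endswith("s") else w for w in kept]


def build_index(docs, language):
    """Map each token to the set of ids of the documents that produced it."""
    index = {}
    for doc_id, title in docs.items():
        for token in tokenize(title, language):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, query, language):
    """Return ids of documents matching any query token (logical OR)."""
    hits = set()
    for token in tokenize(query, language):
        hits |= index.get(token, set())
    return hits


# Both indexes are built with English rules, as in the ticket.
first = build_index({1: "What are you?"}, "english")    # nothing is indexed
second = build_index({1: "Whats are you?"}, "english")  # only "what" is indexed

print(search(first, "what are you?", "none"))       # set()
print(search(second, "whats are you?", "none"))     # set()
print(search(second, "what are you?", "none"))      # {1}
print(search(second, "whats are you?", "english"))  # {1}
```

&lt;p&gt;In this sketch the index keys are fixed at insert time by the index language, so a query tokenized under different rules only matches when it happens to produce the same keys.&lt;/p&gt;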

&lt;p&gt;We&apos;re actively discussing ways to improve this behavior. Please feel free to review a related ticket, &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-29918&quot; title=&quot;stemming behavior for diacritics causes incorrect results&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-29918&quot;&gt;&lt;del&gt;SERVER-29918&lt;/del&gt;&lt;/a&gt;, which discusses a similar stemming issue. For now, I would recommend using the same language for both your queries and text index.&lt;/p&gt;

&lt;p&gt;Kind regards,&lt;br/&gt;
Kelsey&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="399222">SERVER-29918</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18555" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname># of Sprints</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 14 Sep 2017 20:35:40 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 16 weeks, 3 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>luke.bonanomi@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 16 weeks, 3 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>kelsey.schubert@mongodb.com</customfieldvalue>
            <customfieldvalue>kyle.suarez@mongodb.com</customfieldvalue>
            <customfieldvalue>rajhans</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|htd7if:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hta1kn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10557" key="com.pyxis.greenhopper.jira:gh-sprint">
                        <customfieldname>Sprint</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue id="1924">Query 2017-10-23</customfieldvalue>
    <customfieldvalue id="1952">Query 2017-11-13</customfieldvalue>
    <customfieldvalue id="1979">Query 2017-12-04</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[kelsey.schubert@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|htctlb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>