<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 06:52:59 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-83728] Full Text Search incorrectly matches partial ID-like phrases</title>
                <link>https://jira.mongodb.org/browse/SERVER-83728</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;It&#8217;s my understanding that &lt;tt&gt;$text&lt;/tt&gt; searches do not match partial words or phrases. However, when I search my MongoDB deployment using a text index, it returns documents that only partially match an ID-like search query. For example:&lt;/p&gt;

&lt;p&gt;This search:&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;{ $text: { $search: &apos;&quot;23-X-1&quot;&apos; } }&lt;/tt&gt;&lt;/p&gt;

&lt;p&gt;Matches this document:&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;{ &quot;value&quot;: &quot;123-X-1 Hello&quot; }&lt;/tt&gt;&lt;/p&gt;

&lt;p&gt;Especially since the search term is quoted, I&#8217;d expect it not to match: the phrase &lt;tt&gt;&quot;23-X-1&quot;&lt;/tt&gt; doesn&#8217;t appear on its own anywhere in the value, only as part of the longer phrase &lt;tt&gt;&quot;123-X-1&quot;&lt;/tt&gt;.&lt;/p&gt;</description>
                <environment></environment>
        <key id="2512419">SERVER-83728</key>
            <summary>Full Text Search incorrectly matches partial ID-like phrases</summary>
                <type id="1" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14703&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="11262" iconUrl="https://jira.mongodb.org/images/icons/statuses/generic.png" description="">Investigating</status>
                    <statusCategory id="4" key="indeterminate" colorName="inprogress"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="ted.tuckman@mongodb.com">Ted Tuckman</assignee>
                                    <reporter username="eric@ericmakesapps.com">Eric Ferreira</reporter>
                        <labels>
                            <label>qi-text-search</label>
                            <label>search</label>
                            <label>text</label>
                            <label>text_index</label>
                    </labels>
                <created>Wed, 29 Nov 2023 21:38:28 +0000</created>
                <updated>Thu, 11 Jan 2024 19:37:23 +0000</updated>
                                            <version>6.0.5</version>
                                                                        <votes>0</votes>
                                    <watches>1</watches>
                                                                                                                <comments>
                            <comment id="5975197" author="JIRAUSER1274755" created="Fri, 29 Dec 2023 14:43:22 +0000"  >&lt;p&gt;Hi Team,&lt;/p&gt;

&lt;p&gt;There was some initial confusion on my part as to what the issue was here, but it looks like the user is reporting that tokenization is not applying to numerical values and is uncertain as to whether or not that is intentional. The end result of the current behavior (which I have confirmed and which can be reproduced with a simple reproduction script) is that strings that contain numbers and dashes are partially matched when perhaps they should not be (the user&apos;s example being &lt;b&gt;&lt;tt&gt;123-X-1&lt;/tt&gt;&lt;/b&gt; when &lt;tt&gt;$text&lt;/tt&gt; is given &lt;b&gt;&lt;tt&gt;&quot;23-X-1&quot;&lt;/tt&gt;&lt;/b&gt; to search for). Is this intended?&lt;/p&gt;</comment>
                            <comment id="5975190" author="JIRAUSER1274755" created="Fri, 29 Dec 2023 14:39:53 +0000"  >&lt;p&gt;Hello &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=eric%40ericmakesapps.com&quot; class=&quot;user-hover&quot; rel=&quot;eric@ericmakesapps.com&quot;&gt;eric@ericmakesapps.com&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for explaining in more detail; that does clarify your report. I was able to reproduce the behavior you reported by following your instructions. I&apos;m going to move this ticket over to the appropriate team to confirm whether or not this behavior is expected.&lt;/p&gt;</comment>
                            <comment id="5971136" author="JIRAUSER1275311" created="Tue, 26 Dec 2023 21:49:14 +0000"  >&lt;p&gt;In this particular case, the search term is an &lt;a href=&quot;https://www.mongodb.com/docs/manual/core/link-text-indexes/#search-for-an-exact-phrase&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;&quot;Exact Phrase&quot;&lt;/a&gt;, so tokenization of search terms should not come into play here.&lt;/p&gt;

&lt;p&gt;This issue is more about matches seemingly being found through partial word matching, whereas the fact that &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-15090&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;#15090&lt;/a&gt; is still open strongly implies that &lt;tt&gt;$text&lt;/tt&gt; should not match on partial words, in contrast to Atlas deployments, which can support partial (or fuzzy) matching for text searches.&lt;/p&gt;

&lt;p&gt;The crux is, I&#8217;d not expect the search term to be found since it&#8217;s not present in the text being searched, even when the text being searched is tokenized by whitespace and/or punctuation. It seems to split apart numbers (it finds a match for 23 in the number 123, but that should not be the case without partial matching, right?).&lt;/p&gt;

&lt;p&gt;I don&#8217;t know if I&#8217;m explaining it very well, but let me know if that is clear(er).&lt;/p&gt;</comment>
                            <comment id="5971111" author="JIRAUSER1274755" created="Tue, 26 Dec 2023 21:33:04 +0000"  >&lt;p&gt;Hi &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=eric%40ericmakesapps.com&quot; class=&quot;user-hover&quot; rel=&quot;eric@ericmakesapps.com&quot;&gt;eric@ericmakesapps.com&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Apologies for the incomplete link list, the documentation that I was referring to is as follows:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;https://www.mongodb.com/docs/manual/core/text-search-operators/#std-label-text-search-operators-on-premises&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.mongodb.com/docs/manual/core/text-search-operators/#std-label-text-search-operators-on-premises&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://www.mongodb.com/docs/manual/core/link-text-indexes/#std-label-text-search-on-premises&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.mongodb.com/docs/manual/core/link-text-indexes/#std-label-text-search-on-premises&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://www.mongodb.com/docs/manual/core/indexes/index-types/index-text/#std-label-index-type-text&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.mongodb.com/docs/manual/core/indexes/index-types/index-text/#std-label-index-type-text&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;https://www.mongodb.com/docs/manual/tutorial/text-search-in-aggregation/#std-label-text-agg&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://www.mongodb.com/docs/manual/tutorial/text-search-in-aggregation/#std-label-text-agg&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;As noted in our &quot;Perform a Text Search&quot; documentation, tokenization is performed on the provided terms (that page links to further documentation).&lt;/p&gt;</comment>
                            <comment id="5971033" author="JIRAUSER1275311" created="Tue, 26 Dec 2023 20:50:30 +0000"  >&lt;p&gt;Is anyone alive around here? This was incorrectly closed, as far as I can tell. &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=rhea.thorne%40mongodb.com&quot; class=&quot;user-hover&quot; rel=&quot;rhea.thorne@mongodb.com&quot;&gt;rhea.thorne@mongodb.com&lt;/a&gt; says the documentation page mentions tokenization, as if that should cause the text search to match partial phrases, but that doesn&#8217;t really seem to be the case.&lt;/p&gt;</comment>
                            <comment id="5955482" author="JIRAUSER1275311" created="Sat, 16 Dec 2023 14:31:10 +0000"  >&lt;p&gt;Can you please reopen this, or link to the exact part of the documentation that talks about this tokenization on text search that causes it to match partial words, contrary to the known current behavior?&lt;/p&gt;</comment>
                            <comment id="5953574" author="JIRAUSER1275311" created="Fri, 15 Dec 2023 14:53:27 +0000"  >&lt;p&gt;I mean, I read through the whole documentation before creating this ticket. It obviously is&#160;&lt;em&gt;how&lt;/em&gt; it currently behaves, but I wouldn&apos;t say it&#8217;s expected. Everything in the documentation is geared towards &quot;words&quot; and &quot;phrases&quot;, nowhere implying that it could or would match a partial word. In fact, there&apos;s an open work item about adding the ability to match partial words expressly for that reason (&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-15090&quot; class=&quot;external-link&quot; rel=&quot;nofollow&quot;&gt;#15090&lt;/a&gt;). This definitely leads me to believe that $text should &lt;em&gt;not&lt;/em&gt; match partial words in the content.&lt;/p&gt;</comment>
                            <comment id="5953533" author="JIRAUSER1274755" created="Fri, 15 Dec 2023 14:43:27 +0000"  >&lt;p&gt;Hello &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=eric%40ericmakesapps.com&quot; class=&quot;user-hover&quot; rel=&quot;eric@ericmakesapps.com&quot;&gt;eric@ericmakesapps.com&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for your report. The behavior that you&apos;ve described is the intended behavior. &lt;tt&gt;$text&lt;/tt&gt; searches tokenize your search items, and will search the given fields for appearances of said search items. You can read more about this on our &lt;a href=&quot;https://www.mongodb.com/docs/current/reference/operator/query/text/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;documentation&lt;/a&gt; page.&lt;/p&gt;

&lt;p&gt;At this time, I&apos;ll be closing this ticket as the behavior described is intended.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="500019" name="repro.js" size="443" author="rhea.thorne@mongodb.com" created="Fri, 29 Dec 2023 14:44:44 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25467"><![CDATA[Query Integration]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 15 Dec 2023 14:43:27 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        5 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ted.tuckman@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            5 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10032" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Operating System</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10026"><![CDATA[ALL]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>rhea.thorne@mongodb.com</customfieldvalue>
            <customfieldvalue>eric@ericmakesapps.com</customfieldvalue>
            <customfieldvalue>ted.tuckman@mongodb.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i32iuf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|i2kbi4:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10750" key="com.atlassian.jira.plugin.system.customfieldtypes:textarea">
                        <customfieldname>Steps To Reproduce</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>&lt;ol&gt;
	&lt;li&gt;Add a document to a collection, something like &lt;tt&gt;{ &quot;value&quot;: &quot;123-X-1 Hello&quot; }&lt;/tt&gt;&lt;/li&gt;
	&lt;li&gt;Add a text index on the &lt;tt&gt;value&lt;/tt&gt; property&lt;/li&gt;
	&lt;li&gt;Perform a text search with something like &lt;tt&gt;{ $text: { $search: &apos;&quot;23-X-1&quot;&apos; } }&lt;/tt&gt;&lt;/li&gt;
&lt;/ol&gt;
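&lt;p&gt;The difference between the reported and the expected result can be sketched in plain JavaScript, without a server. This is only an illustration under stated assumptions, not the MongoDB implementation: the helper names &lt;tt&gt;substringMatch&lt;/tt&gt; and &lt;tt&gt;wholeWordMatch&lt;/tt&gt; are hypothetical, and splitting words on whitespace is just one possible reading of what a whole-word phrase match would mean.&lt;/p&gt;

```javascript
// Illustrative only: this is NOT how the MongoDB server is implemented.
// It contrasts two hypothetical ways a quoted phrase could be checked
// against a stored string, using the values from the steps above.

// Substring reading: the phrase may begin or end in the middle of a word.
function substringMatch(text, phrase) {
  return text.toLowerCase().includes(phrase.toLowerCase());
}

// Whole-word reading (assumption: words are separated by whitespace):
// every word of the phrase must equal a consecutive word of the text.
function wholeWordMatch(text, phrase) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const wanted = phrase.toLowerCase().split(/\s+/).filter(Boolean);
  if (wanted.length === 0) return false;
  return words.some((_, start) =>
    wanted.every((word, offset) => words[start + offset] === word)
  );
}

const doc = { value: "123-X-1 Hello" };
const phrase = "23-X-1";

console.log(substringMatch(doc.value, phrase));    // true
console.log(wholeWordMatch(doc.value, phrase));    // false
console.log(wholeWordMatch(doc.value, "123-X-1")); // true
```

&lt;p&gt;Under the whole-word reading, the quoted phrase would not match the stored value; the substring reading yields the same match that the steps above report.&lt;/p&gt;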
</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                    <customfieldvalue><![CDATA[rhea.thorne@mongodb.com]]></customfieldvalue>
    

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i324zr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>