<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 04:21:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-29598] Support Korean language in full text search</title>
                <link>https://jira.mongodb.org/browse/SERVER-29598</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Add Korean to languages supported in MongoDB FTS.&lt;/p&gt;

&lt;p&gt;Original description:&lt;br/&gt;
First of all, MongoDB support stemming for major language like english.&lt;br/&gt;
But there&apos;s no stemming for CJK (Especially I am focusing on Korean). So MongoDB text search is useless for korean language unless stemming Korean in application code.&lt;/p&gt;

&lt;p&gt;I am not sure you are interested in Korean,&lt;br/&gt;
Anyway Korean use only suffix(postpositional word) after stem(base word) like ..&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;Stem : &#54620;&#44544;&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;With suffix : &#54620;&#44544;&#51008;, &#54620;&#44544;&#51060;, &#54620;&#44544;&#51012;, &#54620;&#44544;&#44284;, &#54620;&#44544;&#46020;, &#54620;&#44544;&#52376;&#47100;, ...&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;But current MongoDB implementation, MongoDB search exact match with search term. So Korean word does not matched because of suffix(&quot;&#51008;&quot;, &quot;&#45716;&quot;, &quot;&#51060;&quot;, &quot;&#44032;&quot;, &quot;&#52376;&#47100;&quot;, ...)&lt;/p&gt;

&lt;p&gt;So if MongoDB support range search for text search like below example, We (Korean) can use text-search for Korean language.&lt;/p&gt;
&lt;p/&gt;
&lt;div id=&quot;syntaxplugin&quot; class=&quot;syntaxplugin&quot; style=&quot;border: 1px dashed #bbb; border-radius: 5px !important; overflow: auto; max-height: 30em;&quot;&gt;
&lt;table cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; border=&quot;0&quot; width=&quot;100%&quot; style=&quot;font-size: 1em; line-height: 1.4em !important; font-weight: normal; font-style: normal; color: black;&quot;&gt;
		&lt;tbody &gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;  margin-top: 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;Text : &lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;&#54620;&#44544;&#51008; &#46832;&#50612;&#45212; &#50616;&#50612;&#51077;&#45768;&#45796;.&quot;&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;Search term : &lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;&#54620;&#44544;&quot;&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;Range search in Text-search : &lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;&#54620;&#44544;&quot;&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; &amp;lt;= range &amp;lt; &lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;&#54620;&#44545;&quot;&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; &lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
				&lt;tr id=&quot;syntaxplugin_code_and_gutter&quot;&gt;
						&lt;td  style=&quot; line-height: 1.4em !important; padding: 0em; vertical-align: top;&quot;&gt;
					&lt;pre style=&quot;font-size: 1em; margin: 0 10px;   margin-bottom: 10px;  width: auto; padding: 0;&quot;&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;  (where &lt;/span&gt;&lt;span style=&quot;color: blue; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&quot;&#54620;&#44545;&quot;&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt; is generated simple increment of last character of search term, [like &lt;/span&gt;&lt;span style=&quot;color: #006699; font-weight: bold; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;this&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;|https:&lt;/span&gt;&lt;span style=&quot;color: #008200; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;//github.com/mongodb/mongo/pull/1151/commits/641c3041282746aff280b685424d55926bab93b2#diff-bc6db30f2a5f9618496534d03aeabf54R108])&lt;/span&gt;&lt;span style=&quot;color: black; font-family: &apos;Consolas&apos;, &apos;Bitstream Vera Sans Mono&apos;, &apos;Courier New&apos;, Courier, monospace !important;&quot;&gt;&lt;/span&gt;&lt;/pre&gt;
			&lt;/td&gt;
		&lt;/tr&gt;
			&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p/&gt;

&lt;p&gt;Of course, this feature is not needed for language which has stemming.&lt;br/&gt;
So I want you add knob to enable or disable this range search for text-search (and default is false). Then we can use text search with this knob=true for Korean language.&lt;/p&gt;

&lt;p&gt;I pushed pull-request for this simple idea to &lt;a href=&quot;https://github.com/mongodb/mongo/pull/1151&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;MongoDB github&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This feature will save a lot of Korean guys. Please consider adding this feature seriously.&lt;br/&gt;
(I am not sure this feature is useful for Japanese or China which does not have space in phrase)&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;</description>
                <environment></environment>
        <key id="393273">SERVER-29598</key>
            <summary>Support Korean language in full text search</summary>
                <type id="2" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14711&amp;avatarType=issuetype">New Feature</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="10038" iconUrl="https://jira.mongodb.org/images/icons/subtask.gif" description="">Backlog</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="backlog-query-integration">Backlog - Query Integration</assignee>
                                    <reporter username="sunguck.lee@gmail.com">&#50500;&#45208; &#54616;&#47532;</reporter>
                        <labels>
                            <label>qi-text-search</label>
                    </labels>
                <created>Tue, 13 Jun 2017 08:44:26 +0000</created>
                <updated>Wed, 27 Dec 2023 16:46:08 +0000</updated>
                                                                            <component>Text Search</component>
                                        <votes>6</votes>
                                    <watches>14</watches>
                                                                                                                <comments>
                            <comment id="1611238" author="sunguck.lee@gmail.com" created="Fri, 30 Jun 2017 08:55:47 +0000"  >&lt;p&gt;Hi Asya.&lt;/p&gt;

&lt;p&gt;&amp;gt;&amp;gt; I&apos;m going to convert this ticket into a new feature request for MongoDB to add proper text search support for Korean language.&lt;br/&gt;
Sure, I just added simple code to explain how mongodb can support korean full text search without stemming. And also my pull-request is not complete patch.&lt;/p&gt;

&lt;p&gt;Anyway, I hope &quot;&lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-15090&quot; title=&quot;Improve Text Indexes to support partial word match&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-15090&quot;&gt;SERVER-15090&lt;/a&gt;&quot; is implemented sooner or later.&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;</comment>
                            <comment id="1609907" author="asya" created="Wed, 28 Jun 2017 21:33:22 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=matt.lee&quot; class=&quot;user-hover&quot; rel=&quot;matt.lee&quot;&gt;matt.lee&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;You are correct, MongoDB text search currently does not provide support for Korean (you can see the list of currently supported languages &lt;a href=&quot;https://docs.mongodb.com/manual/reference/text-search-languages/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;here&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The best solution would be for us to add support for Korean, which would include support for appropriate stemming and stop words.   As you found, if the language is not supported, text search uses simple tokenization with no list of stop words and no stemming.&lt;/p&gt;

&lt;p&gt;Your proposed pull request tries to implement prefix text search, a new feature we are already tracking in &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-15090&quot; title=&quot;Improve Text Indexes to support partial word match&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-15090&quot;&gt;SERVER-15090&lt;/a&gt;, however, we cannot accept the pull request for several reasons:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;there are no tests included, so there is no way to make sure that the changes didn&apos;t break existing functionality&lt;/li&gt;
	&lt;li&gt;text indexes can be &lt;a href=&quot;https://docs.mongodb.com/manual/core/index-text/#text-index-compound&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;part of compound indexes&lt;/a&gt; and the proposed changes don&apos;t look like they would work correctly with a compound index&lt;/li&gt;
	&lt;li&gt;please see our &lt;a href=&quot;https://github.com/mongodb/mongo/wiki#contributing-to-mongodb&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;contributor guidelines&lt;/a&gt; for other requirements, like contributor agreement, coding style, etc.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Since we already have a JIRA ticket for prefix search, I think the proposed work for that feature should be tracked there.  I&apos;m going to convert this ticket into a new feature request for MongoDB to add proper text search support for Korean language.&lt;/p&gt;

&lt;p&gt;Thanks for your interest in MongoDB.&lt;/p&gt;

&lt;p&gt;Regards,&lt;br/&gt;
Asya Kamsky&lt;br/&gt;
Lead Product Manager, MongoDB Server&lt;/p&gt;</comment>
                            <comment id="1599364" author="mark.agarunov" created="Fri, 16 Jun 2017 16:41:10 +0000"  >&lt;p&gt;Hello &lt;a href=&quot;https://jira.mongodb.org/secure/ViewProfile.jspa?name=matt.lee&quot; class=&quot;user-hover&quot; rel=&quot;matt.lee&quot;&gt;matt.lee&lt;/a&gt;,&lt;/p&gt;

&lt;p&gt;Thank you for providing the detailed example. I&apos;ve set the fixVersion on this ticket to &quot;Needs Triage&quot; for this new feature to be scheduled against our currently planned work. Updates will be posted on this ticket as they happen.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
Mark&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                        <issuelink>
            <issuekey id="1121857">SERVER-45859</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>3.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                <customfield id="customfield_12751" key="com.atlassian.jira.plugin.system.customfieldtypes:multiselect">
                        <customfieldname>Assigned Teams</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="25467"><![CDATA[Query Integration]]></customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_13552" key="com.go2group.jira.plugin.crm:crm_generic_field">
                        <customfieldname>Case</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[[500A000000XWvtzIAD]]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Fri, 16 Jun 2017 16:41:10 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        6 years, 32 weeks, 5 days ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ted.tuckman@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            6 years, 32 weeks, 5 days ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>asya.kamsky@mongodb.com</customfieldvalue>
            <customfieldvalue>backlog-query-integration</customfieldvalue>
            <customfieldvalue>mark.agarunov</customfieldvalue>
            <customfieldvalue>sunguck.lee@gmail.com</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht92vj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hr2f5z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|ht8oxz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>