<!-- 
RSS generated by JIRA (9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66) at Thu Feb 08 03:01:37 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>MongoDB Jira</title>
    <link>https://jira.mongodb.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.7.1</version>
        <build-number>970001</build-number>
        <build-date>13-04-2023</build-date>
    </build-info>


<item>
            <title>[SERVER-2939] Support Unicode fully in the Mongo shell (was &quot;Linenoise UTF8 support&quot;)</title>
                <link>https://jira.mongodb.org/browse/SERVER-2939</link>
                <project id="10000" key="SERVER">Core Server</project>
                    <description>&lt;p&gt;Allow entry and display of Unicode characters and ensure correct handling of Unicode in all interactions with the server.&lt;/p&gt;</description>
                <environment></environment>
        <key id="16088">SERVER-2939</key>
            <summary>Support Unicode fully in the Mongo shell (was &quot;Linenoise UTF8 support&quot;)</summary>
                <type id="4" iconUrl="https://jira.mongodb.org/secure/viewavatar?size=xsmall&amp;avatarId=14710&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="3" iconUrl="https://jira.mongodb.org/images/icons/priorities/major.svg">Major - P3</priority>
                        <status id="6" iconUrl="https://jira.mongodb.org/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="9">Done</resolution>
                                        <assignee username="tad">Tad Marshall</assignee>
                                    <reporter username="mathias@mongodb.com">Mathias Stearn</reporter>
                        <labels>
                    </labels>
                <created>Tue, 12 Apr 2011 20:57:50 +0000</created>
                <updated>Tue, 12 Jul 2016 00:20:26 +0000</updated>
                            <resolved>Wed, 13 Jun 2012 19:55:32 +0000</resolved>
                                                    <fixVersion>2.1.2</fixVersion>
                                    <component>Shell</component>
                                        <votes>4</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="132051" author="tad" created="Wed, 13 Jun 2012 19:46:13 +0000"  >&lt;p&gt;The issue referenced above (Jun 04 2012 01:17:08 PM UTC), where console output in Windows wasn&apos;t handled correctly, is now fixed.&lt;/p&gt;

&lt;p&gt;The remaining issues in the UTF-8/Unicode feature are:&lt;br/&gt;
1)  Combining characters will not interact with cursor movement correctly.  They will display correctly, but the cursor position will be offset.  The combining characters need to be treated as &quot;zero width&quot; to do this properly.&lt;br/&gt;
2)  Chinese, Japanese and Korean characters may display taking up two screen positions, but cursor positioning will not be correct.  These characters need to be treated as &quot;double width&quot; to do this properly.&lt;/p&gt;

&lt;p&gt;I&apos;m going to resolve this ticket and file a new one for the remaining issues.&lt;/p&gt;</comment>
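<!--
Illustrative note, not part of the ticket thread: the two remaining issues in the comment above come down to mapping each character to the number of screen columns it occupies. A minimal Python sketch, assuming the standard unicodedata module is an acceptable stand-in for a wcwidth-style table. It is not the shell's actual code, and char_width and display_width are hypothetical names.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Approximate number of terminal columns used by one character."""
    # Combining marks attach to the previous character: zero columns.
    if unicodedata.combining(ch):
        return 0
    # East Asian Wide (W) and Fullwidth (F) characters take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def display_width(text: str) -> int:
    """Column at which the cursor sits after printing text."""
    return sum(char_width(ch) for ch in text)
```

This simplification ignores control characters and zero-width characters that are not combining marks (for example U+200B), which a full width table would also need to cover.
-->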
                            <comment id="131999" author="auto" created="Wed, 13 Jun 2012 18:52:40 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
Tad Marshall (email: tad@10gen.com, date: 2012-06-13T02:03:21-07:00)
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2939&quot; title=&quot;Support Unicode fully in the Mongo shell (was &amp;quot;Linenoise UTF8 support&amp;quot;)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2939&quot;&gt;&lt;del&gt;SERVER-2939&lt;/del&gt;&lt;/a&gt; fix Windows console output&lt;/p&gt;

&lt;p&gt;For Windows, when writing to the console, convert text to UTF-16 and&lt;br/&gt;
write it to the screen using WriteConsoleW instead of fwrite or _write.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/70546ba57409051eeef817304955a411f46b763b&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/70546ba57409051eeef817304955a411f46b763b&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="125653" author="tad" created="Mon, 4 Jun 2012 13:17:08 +0000"  >&lt;p&gt;The _write calls are still problematic in the Windows console.  fwrite is unusable because it sends the first byte of a UTF-8 character to the console in a write of its own, which corrupts the display.  But _write is also challenging, for three reasons:&lt;br/&gt;
1)  It won&apos;t necessarily write everything you ask it to write in a single call, so you need to check the return value and call it again if some of your data has not yet been written;&lt;br/&gt;
2)  It returns the number of &lt;b&gt;characters&lt;/b&gt; written, not the number of bytes, so if you are sending it UTF-8 (multi-byte-per-character) data then you need to parse the UTF-8 data to figure out what the return value is telling you;&lt;br/&gt;
3)  Because of the interaction of (1) and (2), it can split UTF-8 characters, causing display corruption.&lt;br/&gt;
This needs to be fixed for version 2.1.2.  The likely fix is to change to using the Windows Console API instead of the C runtime functions.&lt;/p&gt;</comment>
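<!--
Illustrative note, not part of the ticket thread: point (3) in the comment above says that partial writes can split a multi-byte UTF-8 character. One way to avoid that is to cut the buffer only at character boundaries before handing pieces to the output call. The Python sketch below shows that idea only; it is an assumption for illustration, not the fix that was adopted, and utf8_safe_chunks is a hypothetical name.

```python
def is_continuation(byte: int) -> bool:
    """True for UTF-8 continuation bytes, which look like 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def utf8_safe_chunks(data: bytes, max_len: int):
    """Yield slices of data, at most max_len bytes each, cut only between characters.

    Assumes data is valid UTF-8.
    """
    if max_len < 4:
        raise ValueError("max_len must be at least 4, the longest UTF-8 sequence")
    start = 0
    while start < len(data):
        end = min(start + max_len, len(data))
        # If the byte at the cut point continues a character, move the cut back
        # to that character's lead byte so the whole character goes in the next chunk.
        while end < len(data) and end > start and is_continuation(data[end]):
            end -= 1
        if end == start:
            # Only reachable with malformed input; fall back to a plain cut.
            end = min(start + max_len, len(data))
        yield data[start:end]
        start = end
```

Each yielded chunk decodes on its own, so a writer that emits one chunk per call never leaves half a character on the screen.
-->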
                            <comment id="118854" author="auto" created="Sun, 13 May 2012 01:11:49 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
Tad Marshall (login: tadmarshall, email: tad@10gen.com)
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2939&quot; title=&quot;Support Unicode fully in the Mongo shell (was &amp;quot;Linenoise UTF8 support&amp;quot;)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2939&quot;&gt;&lt;del&gt;SERVER-2939&lt;/del&gt;&lt;/a&gt; Windows console _write may not write full buffer&lt;/p&gt;

&lt;p&gt;Update my previous change to deal with calls to _write that&lt;br/&gt;
don&apos;t write the requested length.  Loop until all characters&lt;br/&gt;
have been written.  Affects long output strings from JavaScript.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/4951e568672740b8bd783402afcb03dfd2db1d9c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/4951e568672740b8bd783402afcb03dfd2db1d9c&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="112366" author="tad" created="Sun, 22 Apr 2012 09:48:07 +0000"  >&lt;p&gt;Most of this feature is in 2.1.1.  The remaining part, handling zero-width and double-width characters (for combining characters and wide CJK characters), can go into 2.1.2.&lt;/p&gt;</comment>
                            <comment id="110148" author="auto" created="Mon, 16 Apr 2012 11:22:19 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
Tad Marshall (login: tadmarshall, email: tad@10gen.com)
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2939&quot; title=&quot;Support Unicode fully in the Mongo shell (was &amp;quot;Linenoise UTF8 support&amp;quot;)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2939&quot;&gt;&lt;del&gt;SERVER-2939&lt;/del&gt;&lt;/a&gt; UTF-8 support for the shell&lt;/p&gt;

&lt;p&gt;Major reworking of the internals of linenoise to support UTF-8.  Added&lt;br/&gt;
Utf8String and Utf32String classes adapted from code by Mathias.  Start&lt;br/&gt;
of work to handle zero-width and double-width characters (for combining&lt;br/&gt;
characters and Chinese-Japanese-Korean wide characters) using code from&lt;br/&gt;
Markus Kuhn (called mk_wcwidth as checked in here).  Some additional&lt;br/&gt;
cleanup would be desirable, but all features should now work with Unicode&lt;br/&gt;
in Windows and non-Windows builds.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/fc923dbad70755f8c98a8774299bd5061454a69d&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/fc923dbad70755f8c98a8774299bd5061454a69d&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="101543" author="auto" created="Thu, 22 Mar 2012 18:55:22 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
Tad Marshall (login: tadmarshall, email: tad@10gen.com)
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2939&quot; title=&quot;Support Unicode fully in the Mongo shell (was &amp;quot;Linenoise UTF8 support&amp;quot;)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2939&quot;&gt;&lt;del&gt;SERVER-2939&lt;/del&gt;&lt;/a&gt; Supporting code for UTF-8 in the shell&lt;/p&gt;

&lt;p&gt;This commit lets the shell read UTF-8 from the command&lt;br/&gt;
line and fixes a display problem with lines that start&lt;br/&gt;
with a UTF-8 character.  It does not include the actual&lt;br/&gt;
UTF-8 enabling in linenoise, but prepares for it.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/3363b199e72a7b41c965fea9483d7825fd353d70&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/3363b199e72a7b41c965fea9483d7825fd353d70&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="101345" author="auto" created="Thu, 22 Mar 2012 14:06:27 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
Tad Marshall (login: tadmarshall, email: tad@10gen.com)
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2939&quot; title=&quot;Support Unicode fully in the Mongo shell (was &amp;quot;Linenoise UTF8 support&amp;quot;)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2939&quot;&gt;&lt;del&gt;SERVER-2939&lt;/del&gt;&lt;/a&gt; Add linenoise_utf8.cpp and linenoise_utf.h&lt;/p&gt;

&lt;p&gt;New files for UTF-8 support in the shell.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/95e4630a7ef89cc7030544984cc4ed5f09408e67&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/95e4630a7ef89cc7030544984cc4ed5f09408e67&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="87217" author="tad" created="Fri, 10 Feb 2012 12:01:06 +0000"  >&lt;p&gt;Yes, the current handling of UTF-8 is very poor, and what you&apos;re seeing is the result of code that acts as if everything is ASCII.  Backspacing over a three-byte UTF-8 character deletes only the third byte, leaving corrupt UTF-8.  We know what we need to do to fix it; the problem is always competing priorities: other work gets done first while this waits for attention.  I would very much like to get this fixed in version 2.1.x (and hence in 2.2), and as you can see it is scheduled for the next point release (2.1.1), so hopefully we&apos;ll have this working soon.  If you could test the first version where we claim this is fixed, which could be a nightly build, that would be great, but 2.0.2 and even 2.1.0 simply don&apos;t have the code to do this right.  I&apos;m aiming to get this in before the end of February; since you are watching this Jira ticket, you should see activity when it happens.&lt;/p&gt;</comment>
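<!--
Illustrative note, not part of the ticket thread: the comment above describes backspace removing a single byte from a multi-byte UTF-8 character. A minimal Python sketch of character-aware deletion on a byte buffer, assuming the buffer holds valid UTF-8; delete_last_char is a hypothetical name and this is not the shell's actual code.

```python
def delete_last_char(buf: bytes) -> bytes:
    """Remove the final UTF-8 character (all of its bytes) from buf."""
    end = len(buf)
    if end == 0:
        return buf
    end -= 1
    # Step backwards over continuation bytes (10xxxxxx) to reach the lead byte.
    while end > 0 and (buf[end] & 0xC0) == 0x80:
        end -= 1
    return buf[:end]
```

By contrast, dropping only the last byte (buf[:-1]) of a three-byte character leaves a truncated sequence that no longer decodes as UTF-8.
-->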
                            <comment id="87204" author="jan" created="Fri, 10 Feb 2012 10:12:27 +0000"  >&lt;p&gt;I have a possibly related observation: In mongo shell, when I enter a multibyte UTF-8 character and then try to delete it, what looks like a whitespace is inserted. The number of these frankenspaces is the same as the difference between byte and &quot;symbol&quot; count (so delete in a string with three &#228;&apos;s and you&apos;ll get three extra whitespaces). If I hit delete after entering &quot;&#228;&#246;&#252;&quot; my mongo shell looks like this:&lt;/p&gt;

&lt;p&gt;mongos&amp;gt; &#228;&#252;&amp;lt;space&amp;gt;&amp;lt;space&amp;gt;_&lt;/p&gt;

&lt;p&gt;where _ is the insert point now&lt;/p&gt;

&lt;p&gt;After hitting delete multiple times, the insert point will &quot;catch up&quot; again, but with careless deleting/writing, I can also generate invalid UTF-8 sequences, e.g. by entering &#228;&amp;lt;delete&amp;gt;&#252;&#246;&amp;lt;delete&amp;gt; I&apos;ll get&lt;/p&gt;

&lt;p&gt;mongos&amp;gt; &amp;lt;?&amp;gt;&#252;&amp;lt;space&amp;gt;_&lt;/p&gt;

&lt;p&gt;where &amp;lt;?&amp;gt; is the diamond-shaped black-on-white question mark character.&lt;/p&gt;

&lt;p&gt;Seems like delete deletes bytes not characters here. &lt;/p&gt;

&lt;p&gt;(Sorry if this is a duplicate or in the wrong place; it just seemed to fit with &quot;make the Mongo shell support Unicode properly for all input and output&quot;. I&apos;m using MongoDB shell version: 2.0.2 in GNOME-Terminal 2.30.2 with character encoding set to UTF-8.)&lt;/p&gt;</comment>
                            <comment id="77496" author="auto" created="Wed, 4 Jan 2012 13:46:48 +0000"  >&lt;p&gt;Author:&lt;/p&gt;
Tad Marshall (email: tad@10gen.com)
&lt;p&gt;Message: &lt;a href=&quot;https://jira.mongodb.org/browse/SERVER-2939&quot; title=&quot;Support Unicode fully in the Mongo shell (was &amp;quot;Linenoise UTF8 support&amp;quot;)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;SERVER-2939&quot;&gt;&lt;del&gt;SERVER-2939&lt;/del&gt;&lt;/a&gt; add several static_cast&amp;lt;unsigned char&amp;gt;() casts&lt;/p&gt;

&lt;p&gt;Prevent sign-extension of characters that have their high bit set when&lt;br/&gt;
passed to routines that take &apos;int&apos;.&lt;br/&gt;
Branch: master&lt;br/&gt;
&lt;a href=&quot;https://github.com/mongodb/mongo/commit/e27891645a2871a65d632862bc13e44fd7e83e31&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://github.com/mongodb/mongo/commit/e27891645a2871a65d632862bc13e44fd7e83e31&lt;/a&gt;&lt;/p&gt;</comment>
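<!--
Illustrative note, not part of the ticket thread: the commit message above concerns bytes with the high bit set being sign-extended when a signed char is widened to int. The Python sketch below only emulates that C and C++ behaviour to show the values involved; the function names are hypothetical, and which routines were affected is not stated in the ticket.

```python
def widen_signed_char(byte: int) -> int:
    """Value an 8-bit signed char holds after promotion to int (sign-extended)."""
    return int.from_bytes(bytes([byte]), "little", signed=True)


def widen_unsigned_char(byte: int) -> int:
    """Value after converting to unsigned char first: always in 0..255."""
    return byte & 0xFF
```

For the UTF-8 lead byte 0xC3, the signed path yields a negative number while the unsigned path yields 195, which is why the cast matters for any routine that expects a non-negative character value.
-->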
                            <comment id="67442" author="tad" created="Thu, 17 Nov 2011 19:02:58 +0000"  >&lt;p&gt;I am interpreting this bug to be &quot;make the Mongo shell support Unicode properly for all input and output&quot;, meaning keyboard and display for all supported operating systems.  Internally, strings will be stored in UTF-8, but that isn&apos;t the actual &quot;feature&quot; from the point of view of a user.  I will link duplicate bug reports to this ticket &amp;#8211; this will be the &quot;master&quot; ticket for this feature.&lt;/p&gt;</comment>
                            <comment id="61622" author="brandon" created="Thu, 20 Oct 2011 15:32:30 +0000"  >&lt;p&gt;Transferring back to Mathias who has already done some legwork on a patch.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="10517">SERVER-272</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10012">
                    <name>Related</name>
                                            <outwardlinks description="related to">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="41187">SERVER-6086</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                <customfield id="customfield_10050" key="com.atlassian.jira.toolkit:comments">
                        <customfieldname># Replies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>13.0</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10055" key="com.atlassian.jira.ext.charting:firstresponsedate">
                        <customfieldname>Date of 1st Reply</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Thu, 20 Oct 2011 15:32:30 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10052" key="com.atlassian.jira.toolkit:dayslastcommented">
                        <customfieldname>Days since reply</customfieldname>
                        <customfieldvalues>
                                        11 years, 36 weeks ago
    
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_18254" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Dependencies</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[]]></customfieldvalue>


                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_15850" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10057" key="com.atlassian.jira.toolkit:lastusercommented">
                        <customfieldname>Last comment by Customer</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>true</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10056" key="com.atlassian.jira.toolkit:lastupdaterorcommenter">
                        <customfieldname>Last commenter</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>ramon.fernandez@mongodb.com</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_11151" key="com.atlassian.jira.toolkit:LastCommentDate">
                        <customfieldname>Last public comment date</customfieldname>
                        <customfieldvalues>
                            11 years, 36 weeks ago
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10000" key="com.atlassian.jira.plugin.system.customfieldtypes:radiobuttons">
                        <customfieldname>Old_Backport</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10000"><![CDATA[No]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10051" key="com.atlassian.jira.toolkit:participants">
                        <customfieldname>Participants</customfieldname>
                        <customfieldvalues>
                                        <customfieldvalue>auto</customfieldvalue>
            <customfieldvalue>brandon</customfieldvalue>
            <customfieldvalue>jan</customfieldvalue>
            <customfieldvalue>mathias@mongodb.com</customfieldvalue>
            <customfieldvalue>tad</customfieldvalue>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_14254" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Product Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hrp1fz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_12550" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>2|hrgeqn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10558" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9297</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_23361" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Requested By</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10053" key="com.atlassian.jira.ext.charting:timeinstatus">
                        <customfieldname>Time In Status</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_22870" key="com.onresolve.jira.groovy.groovyrunner:scripted-field">
                        <customfieldname>Triagers</customfieldname>
                        <customfieldvalues>
                                

                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_14350" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>serverRank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hri2av:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                    </customfields>
    </item>
</channel>
</rss>