[SERVER-6244] Mongo shell in Windows doesn't display Unicode when console is set to Terminal font
Created: 28/Jun/12  Updated: 05/Nov/15  Resolved: 29/Jun/12

Status: Closed
Project: Core Server
Component/s: Shell
Affects Version/s: 2.1.2
Fix Version/s: 2.2.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Sridhar Nanjundeswaran
Assignee: Tad Marshall
Resolution: Done
Votes: 0
Labels: Windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MongoDB server 2.1.2 (2008+ build) on Windows 7 x64


Attachments: File tweets.bson    
Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: Windows

 Description   

New description:
When the Windows console is set to use the "Terminal" font (a non-Unicode font), the Mongo shell displays nothing when it tries to print non-ASCII Unicode text. The WriteConsoleW() Windows API returns FALSE and GetLastError() returns error code 31 (ERROR_GEN_FAILURE, "A device attached to the system is not functioning."). Changing the console font to "Consolas" or "Lucida Console" makes Unicode output work again.

To reproduce: mongorestore the attached BSON file, then run db.<coll>.find() from the shell. Even though the document exists and is returned by the cursor, it is not displayed in the shell:
> db.jira.count()
1
> var cursor = db.jira.find()
> var ret = cursor.next()
> ret._id
ObjectId("4feccef2f19b9ad092ad9e5b")

Note this issue does not exist in 2.1.1. It seems to be a problem with 2.1.2 and today's build.

Because of this, findOne() appears to return nothing (nothing is printed) if this is the first document found.



 Comments   
Comment by Tad Marshall [ 29/Jun/12 ]

Documentation should mention that the shell can only display Unicode text correctly in Windows when the console is set to use a Unicode font. This is set through the Properties dialog for the console window, on the Font tab. Unicode fonts available in Windows 7 are "Consolas" and "Lucida Console". The "Raster Fonts" setting selects an OEM (non-Unicode) font, and Unicode (non-ASCII) characters will not display correctly with this font.

Comment by auto [ 29/Jun/12 ]

Author: Tad Marshall <tad@10gen.com>
Date: 2012-06-29T12:43:22-07:00

Message: SERVER-6244 behave better when non-Unicode font used with shell

If the Windows WriteConsoleW() API fails because the console is set
to a non-Unicode font, display a (hopefully helpful) error message
once and then display the text anyway. ASCII parts will display
fine, non-ASCII parts will display with UTF-8 bytes interpreted as
if they were OEM characters.
Branch: master
https://github.com/mongodb/mongo/commit/181ff5d768e45bb4b84622444da9fa3e0181aaa8
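
The fallback described in this commit message can be sketched in a few lines of JavaScript. This is only an illustration of the control flow, not the actual fix, which is C++ code in the linked commit. The names makeConsoleWriter, writeUnicode, writeRaw, and warn are hypothetical stand-ins for the WriteConsoleW() call, a raw byte write, and the one-time error message. Buffer assumes a Node.js runtime.

```javascript
// Sketch only: all names here are hypothetical, not taken from the MongoDB source.
// writeUnicode(text) stands in for WriteConsoleW(); it returns false on failure.
// writeRaw(bytes) stands in for writing the UTF-8 bytes without conversion.
// warn(message) stands in for printing the one-time error message.
function makeConsoleWriter(writeUnicode, writeRaw, warn) {
  let warned = false; // remember whether the message was already shown

  return function write(text) {
    if (writeUnicode(text)) {
      return; // Unicode font in use: normal path
    }
    if (!warned) {
      warn("Unicode output failed; is the console set to a non-Unicode font?");
      warned = true;
    }
    // Display the text anyway. ASCII bytes look right; each non-ASCII
    // character becomes two or more bytes shown as OEM glyphs.
    writeRaw(Buffer.from(text, "utf8"));
  };
}

// Usage with stubs that simulate a console set to the "Terminal" font.
const messages = [];
const rawWrites = [];
const write = makeConsoleWriter(
  () => false,                         // simulated WriteConsoleW() failure
  (bytes) => rawWrites.push(bytes),    // collect raw bytes instead of printing
  (message) => messages.push(message)
);
write("caf\u00e9");
write("ol\u00e1");
console.log(messages.length);     // 1: the warning is issued only once
console.log(rawWrites[0].length); // 5: "café" is five bytes in UTF-8
```

Keeping the "already warned" flag inside the closure means every later write still reaches the console, which matches the "once and then display the text anyway" wording above.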

Comment by Sridhar Nanjundeswaran [ 29/Jun/12 ]

Confirmed that switching fonts makes the document display. However, the 2.1.1 shell works even when the font is set to Raster Fonts.

Comment by Tad Marshall [ 29/Jun/12 ]

It's the font that is selected in the console window that determines whether it works or not. If the selected font is "Consolas" or "Lucida Console" then everything works. If the selected font is "Terminal" (aka "Raster Fonts") then it fails as described above. I need to see if we can display at least the available characters (Terminal is an "ANSI", i.e. non-Unicode, font) in the Terminal font, or convert them to some substitution character, or complain/warn the user.

Comment by Tad Marshall [ 29/Jun/12 ]

I tested this at home with a recent build and I was able to reproduce it. The problem seemed to be with display of Unicode text ... if I displayed the fields one-by-one, only the ones that included Unicode text failed to display. Then I tried it on a new clean build and was unable to reproduce the problem: all fields, including Unicode text fields, displayed fine.

Testing at work on a downloaded version from the download page, I can't reproduce it. Very odd.

The problem I saw was consistent with a failure of MultiByteToWideChar; we convert the entire output string in one shot and if the conversion fails, we print nothing. A failure of WriteConsoleW() might do the same thing ... display nothing.

A field-by-field loop displays everything that doesn't include non-ASCII characters, and a character-by-character display of a string that includes non-ASCII characters works on every character except the non-ASCII ones. In my test that showed the problem, I could display every character in the "text" field except the á (U+00E1, decimal 225) at index 51 in the string. In my retest now, that character displays fine.
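
The field-by-field and character-by-character checks described above can be approximated with a small helper. This is a sketch in plain JavaScript that should run in Node.js (the 2012-era shell may lack some of the syntax used). The function names and the sample document are made up for illustration and do not come from the attached tweets.bson.

```javascript
// Hypothetical helper: report each non-ASCII UTF-16 code unit in a string.
function findNonAscii(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0x7f) {
      const hex = code.toString(16).toUpperCase().padStart(4, "0");
      hits.push({ index: i, codePoint: "U+" + hex });
    }
  }
  return hits;
}

// Hypothetical helper: names of top-level string fields with non-ASCII text.
function fieldsWithNonAscii(doc) {
  return Object.keys(doc).filter(
    (key) => typeof doc[key] === "string" && findNonAscii(doc[key]).length > 0
  );
}

// Made-up sample document.
const sample = { user: "someone", text: "ol\u00e1 mundo" };
console.log(fieldsWithNonAscii(sample)); // [ 'text' ]
console.log(findNonAscii(sample.text));  // [ { index: 2, codePoint: 'U+00E1' } ]
```

Running something like this against a document that fails to display narrows the problem to the exact fields and string indexes involved, without depending on the console being able to render the characters.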

Sridhar, can you try on other machines or VMs and see if it behaves like this for you?
