[SERVER-5280] UTF-8 chars causes console to crash Created: 10/Mar/12 Updated: 15/Aug/12 Resolved: 12/Mar/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Tools |
| Affects Version/s: | 2.0.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Trivial - P5 |
| Reporter: | Joakim B | Assignee: | Tad Marshall |
| Resolution: | Done | Votes: | 0 |
| Labels: | charset | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu 11.10: Linux 3.0.0-16-generic-pae #29-Ubuntu SMP Tue Feb 14 13:56:31 UTC 2012 i686 i686 i386 GNU/Linux |
||
| Operating System: | Linux |
| Participants: |
| Description |
|
Hi,
When I insert a document containing Swedish characters into mongo and read it back using the mongo console, the console crashes, saying that non-ASCII characters were found. This does not seem to affect the REST mongo interface or other ways of getting the data out of mongo, only the terminal console.
Here is some data to reproduce:
Sample document: { name: "åäö" }
Code to reproduce:
MongoDB is installed according to the documentation, http://www.mongodb.org/display/DOCS/Ubuntu+and+Debian+packages
Version data:
#mongo --version
#mongod --version
I am sorry if I have misplaced this bug. This is my first time here and I couldn't really find an appropriate category for it. Please let me know if I can provide any other data. |
| Comments |
| Comment by Tad Marshall [ 12/Mar/12 ] |
|
The 1.8.2 version in the Ubuntu repository was built without the JS_C_STRINGS_ARE_UTF8 setting for SpiderMonkey, and it prints a message at startup warning you about that. The 10gen-built versions have UTF-8 enabled. Glad to hear it's working for you! Thanks for letting us know the resolution! |
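To illustrate why a shell built without UTF-8 support trips over the sample document, here is a minimal Python sketch (it is not MongoDB or SpiderMonkey code). It assumes that such a build reads the string one byte at a time, which is modeled here with a Latin-1 decode; the variable names are made up for illustration.

```python
# Sample value from the bug report.
sample = "åäö"

# UTF-8 encodes each of these Swedish letters as two bytes.
raw = sample.encode("utf-8")

# Every one of those bytes lies outside the 7-bit ASCII range (>= 0x80).
non_ascii = [b for b in raw if b >= 0x80]

# Assumption: a build without UTF-8 support treats each byte as one character.
# Decoding as Latin-1 models that byte-per-character reading.
bytewise = raw.decode("latin-1")

# A UTF-8-aware reader recovers the original three characters.
utf8_aware = raw.decode("utf-8")

print(len(raw), len(non_ascii), len(bytewise), len(utf8_aware))  # prints: 6 6 6 3
```

Under this assumption the three-letter name arrives as six non-ASCII bytes, which would be consistent with the non-ASCII complaint in the report.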
| Comment by Joakim B [ 12/Mar/12 ] |
|
Sorry for posting this. |
| Comment by Tad Marshall [ 10/Mar/12 ] |
|
Hi Joakim,
That error message indicates that you are running a version built without UTF-8 support. I'm not able to reproduce your problem on my system. I just followed the steps you pointed me to on the Ubuntu and Debian packages page. Running on 32-bit Ubuntu 11.10, I used Package Manager to install mongodb-10gen. I started the mongo shell, typed "db.users.insert()" and pasted your { name: "åäö" } object from the bug report. Typing db.users.find() gave me:
Can we make sure that we're running the same build? The Git version shows that we're running the same source code, but the missing bit would be a definition of the C preprocessor symbol JS_C_STRINGS_ARE_UTF8. In the mongo shell, type the command db.serverBuildInfo() and let's compare the output. I get:
If your build date in the sysInfo line is different from mine, perhaps just updating the package would fix it. If it is the same, we'll need to dig deeper.
I should warn you that the 2.0.3 mongo shell doesn't handle UTF-8 correctly when you use command-line editing. This should be fixed in version 2.1.1, but in 2.0.3, backspacing into a non-ASCII string deletes individual bytes from a UTF-8 character instead of deleting the whole character, and the shell then returns corrupted (invalid) UTF-8. This affects only the shell, and only when you edit the command (with, for example, backspace). If you type or paste UTF-8 and do not edit it, it should work. But this isn't the problem you are seeing.
Post your result from db.serverBuildInfo() and we'll go from there, thanks! |
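The backspace problem described above can be sketched in Python. This is only an illustration of byte-wise versus character-wise deletion in a UTF-8 buffer, not the shell's actual line-editing code; the function names are hypothetical.

```python
def backspace_bytes(buf: bytes) -> bytes:
    """Naive backspace: drop only the final byte of the buffer."""
    return buf[:-1]


def backspace_char(buf: bytes) -> bytes:
    """UTF-8-aware backspace: drop the whole final character."""
    i = len(buf) - 1
    # Continuation bytes look like 10xxxxxx; walk back to the lead byte.
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return buf[:i]


def is_valid_utf8(buf: bytes) -> bool:
    """Report whether the buffer decodes cleanly as UTF-8."""
    try:
        buf.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


line = "åäö".encode("utf-8")

# Byte-wise deletion cuts "ö" in half and leaves a dangling lead byte.
print(is_valid_utf8(backspace_bytes(line)))

# Character-wise deletion removes both bytes of "ö".
print(backspace_char(line).decode("utf-8"))
```

The first call leaves an incomplete two-byte sequence at the end of the buffer, which is the kind of invalid UTF-8 the comment warns about; the second leaves "åä".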