[SERVER-2265] Map reduce failed with special character (utf-8) and a pipe ( | ) Created: 21/Dec/10 Updated: 24/Jun/13 Resolved: 24/Jun/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | JavaScript |
| Affects Version/s: | 1.6.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Sandro Munda | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | OS X x86_64, MongoDB 1.6.5 |
| Operating System: | ALL |
| Participants: |
| Description |
|
Hi, I just want to use the map-reduce feature of MongoDB, so I wrote a simple map_index function in JavaScript:
function map_index()
After that, I try to load the file (index.js) into the mongo shell and:
> load('index.js');
There is no problem if I remove "| æ".
var _translit = [
I see the same behavior when I remove "ä" and keep "æ". Of course, the file is encoded in UTF-8. The full test code can be found in the attachment. |
| Comments |
| Comment by Tad Marshall [ 24/Jun/13 ] |
|
SpiderMonkey has been replaced by V8. |
| Comment by Tad Marshall [ 15/Aug/12 ] |
|
This is a SpiderMonkey bug. You don't need to involve the database or map-reduce at all to reproduce it.
The function already contains a corrupted regex. Apparently, SpiderMonkey is using the count of Unicode characters as the byte count, so two UTF-8 characters of two bytes each cause two bytes to be lost from the end of the string.
Three 2-byte UTF-8 characters cause three bytes to be lost from the end; two such characters lose two bytes; one such character loses one byte. |
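The byte-count hypothesis above can be illustrated with a small simulation. This is only a sketch of the suspected mechanism, not SpiderMonkey's actual code: the helper name `simulateTruncation` is hypothetical, and the snippet assumes a runtime that provides `TextEncoder` and `TextDecoder` (for example a current Node.js or browser).

```javascript
// Hypothetical model of the suspected bug: the source is encoded as UTF-8,
// but the number of characters is then used as the number of bytes to keep.
function simulateTruncation(source) {
  const bytes = new TextEncoder().encode(source); // real UTF-8 byte sequence
  // source.length counts UTF-16 code units, which equals the character
  // count for the BMP-only text used here.
  const charCount = source.length;
  const kept = bytes.subarray(0, charCount); // the (wrong) byte count
  return {
    text: new TextDecoder().decode(kept),
    bytesLost: bytes.length - kept.length,
  };
}

// Two 2-byte characters: the closing "/g" falls off the end.
console.log(simulateTruncation('/ä|æ/g'));
// One 2-byte character: only the final "g" is lost.
console.log(simulateTruncation('/ä/g'));
```

Under this model, pure-ASCII source loses nothing, because its character count and its byte count are the same.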
| Comment by Antoine Girbal [ 12/Oct/11 ] |
|
It does work correctly in V8:
> f
function () { var re = /ä|æ/g;}
We should consider opening a bug against SpiderMonkey (SM), though it may already be fixed in SM 1.8. |
| Comment by Antoine Girbal [ 12/Oct/11 ] |
|
It looks like the server is seeing only part of the expression: the closing "/g" is missing.
Tue Oct 11 22:04:22 [conn35] JS Error: SyntaxError: unterminated regular expression literal nofile_b:1
);
A simpler view of the problem is:
> em = function map_index(){
);
);
As you can see, the last two characters of the regular expression have disappeared. Within a string the result is correct. Even a simple regular expression object works. But if the regular expression is inside a function, the bug appears:
> f
This looks like a bug in SpiderMonkey. |
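Since the problem appears only when non-ASCII characters sit inside a function's source, one possible way to sidestep it is to write the regular expression with `\u` escapes so that the source text stays pure ASCII. The sketch below is an assumption, not something tested in this ticket or against the 1.6.5 shell; `replaceSpecial` and `isAsciiOnly` are hypothetical helpers introduced only for illustration.

```javascript
// Hypothetical helper: same pattern as /ä|æ/g, but written with \u escapes
// so the function's source text contains no multi-byte UTF-8 characters.
function replaceSpecial(text) {
  return text.replace(/\u00e4|\u00e6/g, '-');
}

// Hypothetical check that a piece of source text is ASCII-only, that is,
// its character count and its UTF-8 byte count are identical.
function isAsciiOnly(source) {
  for (let i = 0; i < source.length; i++) {
    if (source.charCodeAt(i) > 127) return false;
  }
  return true;
}

console.log(replaceSpecial('k\u00e4s\u00e6'));
console.log(isAsciiOnly('/\\u00e4|\\u00e6/g'));
```

The escaped pattern matches the same characters as the literal one, so the behavior of the function is unchanged; only its source encoding differs.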
| Comment by Sandro Munda [ 22/Dec/10 ] |
|
Hello, I wrote a simple test case; you can find it in the attachment. Before loading the file, you can run this in the mongo shell:
> db.test.drop()
)
Thanks! |
| Comment by Eliot Horowitz (Inactive) [ 21/Dec/10 ] |
|
Certain fields need to be present; an empty object doesn't trigger it. Also, we don't know what "slug" is. |
| Comment by Sandro Munda [ 21/Dec/10 ] |
|
This example fails with every collection; it does not need a specific collection. |
| Comment by Eliot Horowitz (Inactive) [ 21/Dec/10 ] | |||||||||||||||||||||||||
|
Can you also attach some sample data? |