[SERVER-3538] UTF8 null character \u0000 in the middle of a string is not handled correctly Created: 05/Aug/11 Updated: 05/Jan/14 Resolved: 05/Aug/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 1.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Charles-Henri d'Adhémar | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
SUSE Linux Enterprise Server x86_64 10.3 |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
Hello,

The valid UTF-8 character \u0000 is not handled properly: the string is cut at this character. Mongo is probably interpreting it as a string-terminating character.

MongoDB shell version: 1.8.0
)
> db.test.findOne( {text: /foo/})
{ "_id" : ObjectId("4e3bbbaa4e496a38200a6f81"), "text" : "foo" }
> db.test.findOne( {text: /bar/})

We use Mongo to log errors from various servers. We have no control over the characters in the incoming strings, and we have no workaround for this issue so far.

Cheers, |
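The truncation hypothesis above can be illustrated outside MongoDB. The following is only a sketch of how any NUL-terminated C-string interface sees such data. It does not show MongoDB's actual code path, and the helper name `as_c_string` is made up for this illustration.

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return what a NUL-terminated C API would read from `data`.

    Hypothetical helper for illustration only: c_char_p.value stops at the
    first 0x00 byte, the same way strlen()-based C code would.
    """
    return ctypes.c_char_p(data).value


# U+0000 is valid in UTF-8 and encodes to the single byte 0x00.
encoded = "foo\u0000bar".encode("utf-8")

# Everything after the embedded NUL is invisible to the C-string view.
truncated = as_c_string(encoded)

print(len(encoded), truncated)
```

If a component in the query path treats the value this way, a pattern such as /foo/ still matches while /bar/ never sees its target, which is consistent with the session above.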
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 05/Aug/11 ] |
|
See |
| Comment by Charles-Henri d'Adhémar [ 05/Aug/11 ] |
|
Here is some more information.

In Python this case is handled correctly:

In [1]: import re
In [2]: text = u'foo\u0000bar'
In [3]: re.search('foo', text)
In [4]: re.search('bar', text)

In production we use the pymongo driver: the string 'foo\u0000bar' is correctly saved in the DB and correctly retrieved by both the pymongo API and the interactive JavaScript shell. But a regex search for words after the '\u0000' character fails in both pymongo and the interactive shell.

The issue might come from several places: a PCRE library issue, perhaps? It is not as simple as "in UTF-8, \u0000 is the string-terminating character, so this is working as designed".

Do not hesitate to ask for more information. |
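For reference, the Python comparison above can be written as a self-contained script. This is a minimal sketch that uses only Python's own `re` module, so it says nothing about PCRE or the server. The helper `find_span` is a name introduced here for illustration.

```python
import re


def find_span(pattern, subject):
    """Return the (start, end) span of the first match, or None if no match."""
    match = re.search(pattern, subject)
    return match.span() if match else None


text = "foo\u0000bar"

# Python strings carry an explicit length, so the embedded NUL is just
# another character and the search continues past it.
spans = {
    "foo": find_span("foo", text),
    "bar": find_span("bar", text),
}

# The same holds for the UTF-8 encoded bytes, where U+0000 is the byte 0x00.
raw = text.encode("utf-8")
raw_span = find_span(b"bar", raw)

print(spans, raw_span)
```

Both patterns match here because the search is bounded by the stored length of the string rather than by a terminator byte.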