Details
Bug
Resolution: Won't Fix
Minor - P4
Description
Hi,
I have data of the following form:
{ "_id" : ObjectId("54874f34062dfda18bcb47f5"), "a" : 2, "b" : 1, "c" : 1, "d" : "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001", "e" :2}
From Mongo SHELL:
mongos> db.ascii.insert({ "_id" : ObjectId("54874f34062dfda18bcb47f5"), "a" : 2, "b" : 1, "c" : 1, "d" : "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001", "e" :2})
WriteResult({ "nInserted" : 1 })
mongos>
mongos> db.ascii.find()
{ "_id" : ObjectId("54874f34062dfda18bcb47f5"), "a" : 2, "b" : 1, "c" : 1, "d" : "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001", "e" : 2 }
When I run the same query from the Java driver, the output is wrong:
DBCursor cursor = table.find();

while (cursor.hasNext()) {
    DBObject tmp = cursor.next();
    System.out.println(tmp);
}
The output is:
{ "_id" : { "$oid" : "54874f34062dfda18bcb47f5"} , "a" : 2.0 , "b" : 1.0 , "c" : 1.0 , "d" : "" , "e" : 2.0}
Note that the value of d is printed as an empty string.
I found that BasicDBObject.toString calls com.mongodb.util.JSON.serialize() to serialize the object to a JSON string. To
serialize a String value, the Java driver uses the following method,
com.mongodb.util.JSON.string:
static void string( StringBuilder a , String s ){
    a.append("\"");
    for(int i = 0; i < s.length(); ++i){
        char c = s.charAt(i);
        if (c == '\\')
            a.append("\\\\");
        else if(c == '"')
            a.append("\\\"");
        else if(c == '\n')
            a.append("\\n");
        else if(c == '\r')
            a.append("\\r");
        else if(c == '\t')
            a.append("\\t");
        else if(c == '\b')
            a.append("\\b");
        else if ( c < 32 )
            continue;
        else
            a.append(c);
    }
    a.append("\"");
}
From the lines:
else if ( c < 32 )
    continue;
this method skips characters 0-31 and does not emit control characters in Unicode escape
format, so characters \u0000-\u001F (other than \n, \r, \t, and \b, which are handled above)
are dropped when a BasicDBObject is serialized to a string.
How should this kind of Unicode data be handled?
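One possible workaround, sketched here under assumptions: since the skip described above happens only while building the printed string, the control characters can be kept visible by formatting string values with a helper of your own instead of relying on toString. The class JsonStringEscaper and its methods appendQuoted and quote are hypothetical names for illustration, not part of the MongoDB Java driver. The helper mirrors the quoted method but writes the remaining control characters as six-character Unicode escapes instead of skipping them.

```java
// Hypothetical helper, not part of the MongoDB Java driver.
final class JsonStringEscaper {
    private JsonStringEscaper() {}

    // Appends s to out as a double-quoted string. Control characters below 32
    // that have no short escape are written as a backslash, 'u', and four hex digits.
    static void appendQuoted(StringBuilder out, String s) {
        out.append('"');
        for (int i = 0; i < s.length(); ++i) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                case '\b': out.append("\\b");  break;
                default:
                    if (c < 32) {
                        // Escape instead of dropping the character.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        out.append('"');
    }

    // Convenience wrapper returning the quoted form as a new String.
    static String quote(String s) {
        StringBuilder out = new StringBuilder();
        appendQuoted(out, s);
        return out.toString();
    }

    public static void main(String[] args) {
        // Same value as field d in the report: ten U+0001 characters.
        String d = "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001";
        System.out.println(quote(d));
    }
}
```

With a helper like this, a value read from the document (for example the String stored under d) could be passed to quote before printing, so the escapes appear in the output rather than an empty string.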
Issue Links
duplicates: JAVA-1642 Mongo Connector does not extract valid control character from 0x00 to 0x1F." issue against T2Mongo Connector (Closed)