- Type: Bug
- Resolution: Won't Fix
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: JSON
Hi,
I have a document like this:
{ "_id" : ObjectId("54874f34062dfda18bcb47f5"), "a" : 2, "b" : 1, "c" : 1, "d" : "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001", "e" :2}
From the mongo shell, the insert and the subsequent find both show the data as expected:
mongos> db.ascii.insert({ "_id" : ObjectId("54874f34062dfda18bcb47f5"), "a" : 2, "b" : 1, "c" : 1, "d" : "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001", "e" :2})
WriteResult({ "nInserted" : 1 })
mongos> db.ascii.find()
{ "_id" : ObjectId("54874f34062dfda18bcb47f5"), "a" : 2, "b" : 1, "c" : 1, "d" : "\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001", "e" : 2 }
When I read the same document with the Java driver, the output is not correct:
DBCursor cursor = table.find();
while (cursor.hasNext()) {
    DBObject tmp = cursor.next();
    System.out.println(tmp);
}
The output is:
{ "_id" : { "$oid" : "54874f34062dfda18bcb47f5"} , "a" : 2.0 , "b" : 1.0 , "c" : 1.0 , "d" : "" , "e" : 2.0}
Note that the value of d is printed as an empty string.
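The printed line alone does not show whether the characters are missing from the fetched value or only from its printed form. A minimal sketch of one way to check, assuming the field is read as a String (for example via (String) tmp.get("d") in the loop above). StringInspector and describe are hypothetical names used only for illustration and are not part of the driver:

```java
public class StringInspector {

    // Hypothetical helper (not a driver API): lists every UTF-16 code unit of
    // a string as U+XXXX so that unprintable characters become visible.
    static String describe(String s) {
        StringBuilder out = new StringBuilder("length=" + s.length() + " [");
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return out.append(']').toString();
    }

    public static void main(String[] args) {
        // Stand-in for the value of "d"; in the report it would come from the cursor.
        String d = "\u0001\u0001\u0001";
        System.out.println(describe(d)); // length=3 [U+0001 U+0001 U+0001]
    }
}
```

If the reported length is 10 for the document above, the data arrived intact and only the string rendering drops it.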
Looking into it, I found the following:
BasicDBObject.toString calls com.mongodb.util.JSON.serialize() to serialize the object to a JSON string. To
serialize a String value, the Java driver uses the following method,
com.mongodb.util.JSON.string:
static void string( StringBuilder a , String s ){
    a.append("\"");
    for(int i = 0; i < s.length(); ++i){
        char c = s.charAt(i);
        if (c == '\\')
            a.append("\\\\");
        else if(c == '"')
            a.append("\\\"");
        else if(c == '\n')
            a.append("\\n");
        else if(c == '\r')
            a.append("\\r");
        else if(c == '\t')
            a.append("\\t");
        else if(c == '\b')
            a.append("\\b");
        else if ( c < 32 )
            continue;
        else
            a.append(c);
    }
    a.append("\"");
}
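Because of the line:
else if ( c < 32 ) continue;
this method skips every character in the range 0-31 that does not have one of the short escapes above, and it never writes control characters in \uXXXX form. As a result, the characters \u0000-\u001F (other than \b, \t, \n and \r) are silently dropped when a BasicDBObject is serialized to a string.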
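As a possible workaround on the application side, the string can be quoted by custom code instead of relying on toString. The sketch below mirrors the branches of the method quoted above but writes the remaining control characters as four-digit hexadecimal escapes rather than skipping them. ControlCharEscaper and quote are hypothetical names, not part of the driver, and this is not the driver's own fix:

```java
public class ControlCharEscaper {

    // Hypothetical helper (not a driver API): quotes a string for JSON output.
    // Same short escapes as the quoted driver method, but control characters
    // below 32 are written as a backslash, 'u' and four hex digits instead of
    // being dropped.
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                case '\b': out.append("\\b");  break;
                default:
                    if (c < 32) {
                        // Keep the character visible as a Unicode escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        // Two U+0001 characters, as in field "d" of the report (shortened).
        System.out.println(quote("\u0001\u0001"));
    }
}
```

Running main prints "\u0001\u0001" including the surrounding quotes, so the control characters stay visible in the output and can be parsed back by a JSON reader.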
How should this kind of Unicode data be handled?
- duplicates
JAVA-1642 Mongo Connector does not extract valid control character from 0x00 to 0x1F." issue against T2Mongo Connector (Closed)