[JAVA-1679] Mongo cursor.next() -> org.bson.BSONException: bad string size upon fetching “hvězdnější” Created: 10/Mar/15  Updated: 16/Jul/15  Resolved: 22/Apr/15

Status: Closed
Project: Java Driver
Component/s: BSON
Affects Version/s: 2.12.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Šimon Demočko Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

(Copied from Stack Overflow on suggestion: http://stackoverflow.com/q/28950708/1920149 )

I get the exception below when my code calls cursor.next(). The document that should be returned is:

{
    "_id" : ObjectId("54fcc2b5664347b9be743599"),
    "content" : "hvězdnější",
    "languages" : {
        "cs" : { "lemma" : "hvězdnější" }
    }
}
This is the exception.

WARNING: emptying DBPortPool to localhost/127.0.0.1:27017 b/c of error
org.bson.BSONException: bad string size: 1701707776
at org.bson.BasicBSONDecoder$BSONInput.readUTF8String(BasicBSONDecoder.java:447)
at org.bson.BasicBSONDecoder.decodeElement(BasicBSONDecoder.java:155)
at org.bson.BasicBSONDecoder.decodeElement(BasicBSONDecoder.java:196)
at org.bson.BasicBSONDecoder._decode(BasicBSONDecoder.java:79)
at org.bson.BasicBSONDecoder.decode(BasicBSONDecoder.java:57)
at com.mongodb.DefaultDBDecoder.decode(DefaultDBDecoder.java:61)
at com.mongodb.Response.<init>(Response.java:83)
at com.mongodb.DBPort.go(DBPort.java:142)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._next(DBCursor.java:419)
at com.mongodb.DBCursor.next(DBCursor.java:494)
What can be done about this? When I fetch the same document with RoboMongo, it is received correctly.

Note: The cursor did not fail on hundreds of thousands of other documents. I presume the problem with this one is the word hvězdnější, because of its non-ASCII UTF-8 characters (UTF-8 also appears in the stack trace).
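For reference, BSON stores a string as a little-endian int32 length (the UTF-8 byte count plus one for the trailing NUL), followed by the bytes and the NUL. The following is a minimal diagnostic sketch, not part of the original report and not a confirmed root cause; the class and method names are hypothetical. It prints the length prefix that a correctly encoded "hvězdnější" would carry, and the four bytes that the reported size 1701707776 corresponds to on the wire.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; it does not use the MongoDB driver.
public class BsonStringSizeCheck {

    // Length prefix BSON stores for a string value:
    // the UTF-8 byte count plus one for the trailing NUL byte.
    static int bsonStringSize(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    // The four bytes of an int32 in little-endian (wire) order, as hex.
    static String littleEndianHex(int value) {
        byte[] raw = ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The word from the report, written with Unicode escapes
        // so the result does not depend on the source file encoding.
        String word = "hv\u011bzdn\u011bj\u0161\u00ed";

        System.out.println("chars:            " + word.length());
        System.out.println("UTF-8 bytes:      " + word.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("BSON size prefix: " + bsonStringSize(word));

        // The size reported in the exception message.
        System.out.println("bad size bytes:   " + littleEndianHex(1701707776));
    }
}
```

The expected prefix is small, while the reported size corresponds to the bytes 00 00 6e 65, whose last two bytes are ASCII letters. That would be consistent with the decoder reading a length from the wrong offset, but it does not by itself show where the misalignment comes from.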



 Comments   
Comment by Ross Lawley [ 22/Apr/15 ]

Unable to reproduce so closing.

If you can provide a test case that reproduces the issue, please comment and I'll reopen.

Comment by Ross Lawley [ 17/Mar/15 ]

Hi mimkorn,

I'm unable to reproduce this on my Ubuntu system. I tried the r2.12.4 Java driver against MongoDB 2.4, 2.6, and 3.0. Restoring the BSON dump to debug.word worked as expected, and reading it back produced the following output without issues:

{ "_id" : { "$oid" : "54fcc2b5664347b9be743599"} , "content" : "hvězdnější" , "languages" : { "cs" : { "lemma" : "hvězdnější"}}}

My test code was:

        // Read every document in the restored collection and print it.
        collection = getMongoClient().getDB("debug").getCollection("words");

        DBCursor cursor = collection.find();
        while (cursor.hasNext()) {
            System.out.println(cursor.next());
        }

Comment by Šimon Demočko [ 15/Mar/15 ]

Here is a small dump including the corrupted document: http://jmp.sh/nK93yAg

Comment by Jeffrey Yemin [ 12/Mar/15 ]

Hi,

We're not able to reproduce the issue just by copying the code that you pasted. To move this issue forward, please provide a mongodump of a minimalistic version of the collection containing just the troublesome document and a few more. UTF-8 issues can be tricky, and I don't trust the fidelity of copy and paste in this case.
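One way copy and paste can alter such a string is Unicode normalization: the same visible text can be stored with precomposed characters (NFC) or with base letters followed by combining marks (NFD), and the two forms have different UTF-8 byte lengths. The sketch below is an illustration under that assumption, not a finding about this document; the class and method names are hypothetical. It uses only the JDK to print the code points of a string so that two visually identical values can be compared.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper class for illustration only.
public class Utf8FidelityCheck {

    // Code points as U+XXXX, so visually identical strings can be told apart.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Precomposed (NFC) form of the word from the report.
        String composed = "hv\u011bzdn\u011bj\u0161\u00ed";
        // Decomposed (NFD) form: base letters plus combining marks.
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        System.out.println("NFC: " + codePoints(composed));
        System.out.println("NFD: " + codePoints(decomposed));
        System.out.println("UTF-8 lengths: " + utf8Length(composed)
                + " vs " + utf8Length(decomposed));
        System.out.println("equal as Java strings: " + composed.equals(decomposed));
    }
}
```

Running the same check on the value read from the application and on the value pasted into a ticket shows whether they are really the same sequence of code points, which is why a mongodump of the stored bytes is a more reliable artifact than pasted text.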

Comment by Šimon Demočko [ 10/Mar/15 ]

Via Java code. Both the collection and the code are confidential, and I am not entitled to share them. However, I could provide a minimal version containing just this document, or a few more, along with the lines of code responsible for adding it.

The document got into the database through an upsert from Java code.

The relevant fragment of the code is here:

      BasicDBObject language = new BasicDBObject(STRING, new BasicDBObject(STRING, "hvězdnější"));
      wordFindQuery.put("SOMESTRINGHERE", language);
      BasicDBObject wordUpdateValuePart
              = new BasicDBObject("SOMESTRINGHERE" + "." + "SOMESTRINGHERE" + "." + SOME_INT_VALUE_HERE,
                      SOME_INT_VALUE_HERE);
      BasicDBObject updateObj = new BasicDBObject("$inc", wordUpdateValuePart);
      // upsert = true, multi = false
      wordCollection.update(wordFindQuery, updateObj, true, false);

"hvězdnější" appears in the first line of the snippet above. The document was not in the database before; the upsert created it.

Comment by Jeffrey Yemin [ 10/Mar/15 ]

How did this particular document get inserted into MongoDB? I think in order to reproduce this we're going to need a way to create the exact same document, either with a small test program that inserts it or a mongodump of a collection that contains it. Can you provide either?

Generated at Thu Feb 08 08:55:13 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.