[JAVA-27] buffer overrun for large documents causes an exception for DBCursor.hasNext() Created: 01/Sep/09  Updated: 02/Oct/09  Resolved: 15/Sep/09

Status: Closed
Project: Java Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jason Sachs Assignee: Eliot Horowitz (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

WinXP, Java SE 6u13, Mongo Java driver mongo-0.7.jar



 Description   

I have some large documents (representing the contents of files in the 250K-1MB range) that cannot be read with the Java driver. The failure happens with DBCursor.hasNext():

public void catalog()
{
    DBObject mdref = new BasicDBObject();
    mdref.put("metadata", 1);
    DBCursor cursor = this.coll.find();
    // the above line causes an error (the following line works ok)
    // DBCursor cursor = this.coll.find(new BasicDBObject(), mdref);
    System.out.println("Found " + cursor.count() + " objects.");
    while (cursor.hasNext())
    {
        DBObject obj = cursor.next();
        System.out.println(obj);
    }
}

When I run this on my database collection, I get the following. The call to find() succeeds, and the cursor's count() method works ok, but it bombs on the first hasNext() call; it looks like some kind of buffer overrun.

Found 10 objects.
Exception in thread "main" java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkBounds(Unknown Source)
at java.nio.DirectByteBuffer.get(Unknown Source)
at com.mongodb.ByteDecoder.decodeNext(ByteDecoder.java:189)
at com.mongodb.ByteDecoder.readObject(ByteDecoder.java:103)
at com.mongodb.DBApiLayer$SingleResult.<init>(DBApiLayer.java:538)
at com.mongodb.DBApiLayer$MyCollection.find(DBApiLayer.java:419)
at com.mongodb.DBCursor._check(DBCursor.java:214)
at com.mongodb.DBCursor._hasNext(DBCursor.java:322)
at com.mongodb.DBCursor.hasNext(DBCursor.java:347)
at com.dekaresearch.tools.datasheets.Manager.catalog(Manager.java:45)
at com.dekaresearch.tools.datasheets.Manager$Action$2.execute(Manager.java:21)
at com.dekaresearch.tools.datasheets.Manager.main(Manager.java:69)



 Comments   
Comment by Eliot Horowitz (Inactive) [ 15/Sep/09 ]

Seems to be a PHP issue.

Comment by Jason Sachs [ 04/Sep/09 ]

I'll try to code up a simple test case. The problem seems to be interoperation of PHP and Java code operating on DBObjects containing binary data.

Comment by Eliot Horowitz (Inactive) [ 04/Sep/09 ]

I'm having trouble reproducing this.
I just added some more debugging.
Can you use the latest code from GitHub and give it a try?

Comment by Jason Sachs [ 01/Sep/09 ]

Well, I can't seem to get it to fail if I use Java, rather than PHP, to write the data. The following program seems to read/write fine:

public void testbugJava27(String charsetName) throws UnknownHostException, MongoException, UnsupportedEncodingException
{
    StringBuilder bigString = new StringBuilder();
    for (int i = 0; i < 5000; ++i)
    {
        bigString.append("Line " + i + " contains 'abcdabcdefghefgh'\n");
    }
    String s1 = bigString.toString();

    StringBuilder sb = new StringBuilder();
    Mongo db = new Mongo( "127.0.0.1", "testdb" );
    DBCollection coll = db.getCollection("test");
    coll.drop();
    String[] daysOfChristmas =
    {
        "a partridge in a pear tree", "two turtle doves", "three french hens",
        "four calling birds", "five golden rings", "six geese a-laying",
        "seven swans a-swimming", "eight maids a-milking", "nine ladies dancing",
        "ten lords a-leaping", "eleven pipers piping", "twelve drummers drumming",
    };
    String[] suffixes = {"st", "nd", "rd", "th"};

    for (int k = 0; k < daysOfChristmas.length; ++k)
    {
        DBObject obj = new BasicDBObject();
        DBObject md = new BasicDBObject();
        md.put("foo", 123);
        md.put("bar", 456);
        md.put("k", k);
        obj.put("metadata", md);

        sb.append(s1);
        sb.append("On the " + (k+1) + suffixes[k > 3 ? 3 : k] + " day of Christmas my true love sent to me:\n");
        for (int j = k; j >= 0; --j)
        {
            if (j == 0 && k > 0) sb.append("and ");
            sb.append(daysOfChristmas[j]);
            sb.append(j == 0 ? ".\n" : ",\n");
        }

        String s = sb.toString();
        byte[] data;
        if ("evil".equals(charsetName))
        {
            data = s.getBytes();
            int L = data.length;
            int N = 1000;
            Random r = new Random();
            r.setSeed(12345);
            for (int i = 0; i < N; ++i)
            {
                data[i] = (byte)r.nextInt(256);
            }
        }
        else
        {
            data = charsetName == null ? s.getBytes() : s.getBytes(charsetName);
        }

        obj.put("data", data);
        coll.save(obj);
    }

    DBCursor cursor = coll.find();
    final int N = 400;
    while (cursor.hasNext())
    {
        DBObject obj = cursor.next();
        System.out.println(obj.get("metadata"));
        Object d = obj.get("data");
        String s = new String((byte[])d);
        int l = s.length();
        System.out.println("length=" + l);
        System.out.println("last " + N + " chars = " + s.substring(l - N, l));
    }
}

Comment by Jason Sachs [ 01/Sep/09 ]

Hmmm. My database seems to have UTF-8 issues, so maybe this isn't a buffer overrun issue per se (see SERVER-174). I'll try to reproduce it with "plain" data.
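As a side note, the UTF-8 lossiness itself can be demonstrated with plain JDK code, independent of the driver. This is only a sketch of the suspected mechanism (the class name is made up, and I'm not claiming this is what the driver does internally): arbitrary binary data does not survive a decode/re-encode round trip through a UTF-8 string, because invalid byte sequences are replaced with U+FFFD (3 bytes each), so any byte length computed before the round trip no longer matches afterward.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

public class Utf8RoundTrip {
    // Decode arbitrary bytes as UTF-8 and re-encode them. Invalid sequences
    // become U+FFFD on decode, so the result generally differs from the input.
    public static byte[] roundTripUtf8(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same style of "evil" data as the test case above: seeded random bytes.
        byte[] raw = new byte[1000];
        new Random(12345).nextBytes(raw);

        byte[] reencoded = roundTripUtf8(raw);
        System.out.println("original length   = " + raw.length);
        System.out.println("re-encoded length = " + reencoded.length);
        System.out.println("lossless          = " + Arrays.equals(raw, reencoded));
    }
}
```

If a stored length prefix were computed against one of these byte arrays and the data were later read as the other, a reader trusting the prefix would walk past the end of its buffer, which is consistent with the IndexOutOfBoundsException above.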

Comment by Jason Sachs [ 01/Sep/09 ]

p.s. from the Red Herring Clarification Department:

The lines above about find() succeeding when given additional arguments are there because each of my documents contains a "metadata" field (a small sub-object) and a "data" field (large), so if I download only the "metadata" field, it works fine.

Generated at Thu Feb 08 08:51:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.