[JAVA-194] Improved Performance of BSONDecoder Created: 21/Oct/10  Updated: 25/Jun/13  Resolved: 25/Jun/13

Status: Closed
Project: Java Driver
Component/s: Performance
Affects Version/s: 2.2
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Hans Meiser Assignee: Unassigned
Resolution: Done Votes: 5
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends

 Description   

Hi,

my commit

http://github.com/theunique/mongo-java-driver/commit/8b0ca1ba9f53285b5a3084e233b116d9228bc277

contains a patch to improve the performance of BSONDecoder. According to the thread

http://groups.google.com/group/mongodb-dev/browse_thread/thread/74fa538f4281a2be

the improvement is up to 40%

ciao.hans.



 Comments   
Comment by Jeffrey Yemin [ 25/Jun/13 ]

The BSON decoder has been re-written several times since this was opened. Please open a new issue if you are still seeing problems in the latest driver.

Comment by Lucas POUZAC [ 17/Mar/11 ]

Hi,

When is this fix expected? 2.6?

Comment by Eliot Horowitz (Inactive) [ 20/Dec/10 ]

We've made a number of the changes for 2.4, but there are still more to do, which is why I haven't moved this case.

Comment by Jeff Yemin (Inactive) [ 20/Dec/10 ]

Is this now for 2.4, or is it still unscheduled?

Comment by Eliot Horowitz (Inactive) [ 13/Dec/10 ]

Correct

Comment by Hans Meiser [ 13/Dec/10 ]

Is 'master' your current version on GitHub?

Comment by Eliot Horowitz (Inactive) [ 09/Dec/10 ]

A lot of the changes are in 2.4
A number of edge cases had to be tweaked, but can you test master vs 2.3?

Comment by Hans Meiser [ 09/Dec/10 ]

Any news on that issue?

Comment by Eliot Horowitz (Inactive) [ 22/Oct/10 ]

Still want to try to see what's actually making the difference.
In my profiling, I think the big difference is reading to bytes vs. reading to char.
Going to try to merge something like that in.

Comment by Hans Meiser [ 22/Oct/10 ]

My BSONDecoder didn't support surrogates. Now it does.
BSONDecoder now does a simple caching of short strings (inspired by your one-byte-strings)

I also removed my previous commit and committed a new version without formatting it.

http://github.com/theunique/mongo-java-driver/commit/ab29686a2f152556bf2d46f8506bfd216597444e

Comment by Hans Meiser [ 21/Oct/10 ]

I have a version based on your GitHub repository that has not been run through the Eclipse formatter. Would that help?

Comment by Hans Meiser [ 21/Oct/10 ]

Some of the performance improvement came from the UTF-8 handling, but improvements also come from read-ahead; for example, no byte[4] is allocated to read a simple int.
Most of it comes from the tight inner loop.
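
To illustrate the point about avoiding a temporary byte[4]: the following is a minimal sketch, not the code from the commit. It assumes the decoder already holds the input in a larger byte[] buffer and knows the current offset; the class and method names are hypothetical. BSON stores int32 values little-endian, so the four bytes can be combined in place.

```java
/** Hypothetical helper; not part of the driver or of the commit discussed here. */
public final class LittleEndianInts {
    private LittleEndianInts() {}

    /**
     * Reads a 32-bit little-endian int directly from buf at offset.
     * No intermediate byte[4] is allocated; the caller advances its own position by 4.
     */
    public static int readInt(byte[] buf, int offset) {
        return (buf[offset] & 0xFF)
             | (buf[offset + 1] & 0xFF) << 8
             | (buf[offset + 2] & 0xFF) << 16
             | (buf[offset + 3] & 0xFF) << 24;
    }
}
```

Whether this accounts for a measurable share of the speedup would need to be confirmed by profiling.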

Sorry, I previously had a fork with all the commits and erased it to give you one commit, to simplify the pull request.

PS: Your isAscii check is faster if you OR all the bytes together and test the high bit once after the loop:

isAscii = 0
...
while()
{
b = ...
isAscii |= b;
...
}

if((isAscii&0x80)!=0)
{
no
}
else
{
ascii
}
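
The pseudocode above could look roughly like this in Java. This is a sketch under assumptions: the class name, method name and parameters are hypothetical, and it scans a byte range in isolation, whereas the decoder would fold the OR into its existing read loop. Note the parentheses around the mask, since != binds tighter than & in Java.

```java
/** Hypothetical helper illustrating the OR-accumulate ASCII check. */
public final class AsciiCheck {
    private AsciiCheck() {}

    /** Returns true if every byte in buf[offset, offset + length) is below 0x80. */
    public static boolean isAscii(byte[] buf, int offset, int length) {
        int acc = 0;
        for (int i = offset, end = offset + length; i < end; i++) {
            acc |= buf[i]; // no branch per byte; bit 7 stays set once any byte has it
        }
        return (acc & 0x80) == 0; // a single test after the loop
    }
}
```

The loop body has no per-byte branch, which is the property the suggestion relies on.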

Comment by auto [ 21/Oct/10 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: removed some UTF-8 conversions by checking values JAVA-194
http://github.com/mongodb/mongo-java-driver/commit/306e36c148c19447a03c81a9d93c4e947d97feb0

Comment by Eliot Horowitz (Inactive) [ 21/Oct/10 ]

Reading the diff I think the main reason you see a speed bump is because you're not doing some utf8 encoding in some places.
The problem is that we need to, as the unit tests don't pass with your change.

Try "ant test" to see some failures.

There are a lot of diffs in that commit.
Some seem to be stylistic formatting, and there are a few other concepts.

very hard to see what's going on exactly.

will try to see if there is anything we can use, but breaking it up into smaller commits, and making sure style/formatting doesn't change would be very helpful
