[JAVA-2144] For Codec decoding, optimize code path for looking up Codec based on a BsonType Created: 07/Mar/16 Updated: 22/Dec/16 Resolved: 10/Mar/16 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | BSON, Performance |
| Affects Version/s: | 3.0.0 |
| Fix Version/s: | 3.3.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Jeffrey Yemin | Assignee: | Jeffrey Yemin |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
For Document, BsonDocument, and DBObject Codec.decode implementations, there is a code path for looking up the Codec for a value that requires two lookups in a Map; first by one in BsonTypeClassMap, then followed by one in CodecCache. This could easily be turned into a single lookup in an array that has an entry for each BsonType and a value which is the Codec to use for that BsonType. Performance benchmarks show that the performance gains would be significant for simple documents. |
| Comments |
| Comment by Hakan Özler [ 22/Dec/16 ] | ||||||||||||||||||||||||
|
Hi Jeff, Thank you for the detailed information. I have changed the code based on your advice and now it works as expected. | ||||||||||||||||||||||||
| Comment by Jeffrey Yemin [ 22/Dec/16 ] | ||||||||||||||||||||||||
|
Hi Hakan, The reason this started throwing an exception in 3.4 is due to the optimization introduced for this issue. To avoid repeated lookups in the CodecRegistry, DocumentCodec now caches the Codec instances required for each BsonType. Since your replacements maps to CustomDataCodec, this cache is trying to look up a Codec for the CustomDataCodec class, which was not the intention. So basically, the gist you provided is not using 'replacements' correctly. Replacements is a map from the BsonType of the value that's stored in MongoDB to the class to which you want to decode values of that type. It's not used for decoding at all. I think what you want here is to just not set replacements at all. Then all your dates and longs will be encoded as strings, as it appears is your intention. Of course, on decoding they will be decoded as strings, as there will no longer be a way to differentiate strings, dates, and integers. | ||||||||||||||||||||||||
| Comment by Hakan Özler [ 22/Dec/16 ] | ||||||||||||||||||||||||
|
Hi Jeff, Thank you for the quick reply. I have made a gist that uses only the necessary codec classes of the BSON api. | ||||||||||||||||||||||||
| Comment by Jeffrey Yemin [ 21/Dec/16 ] | ||||||||||||||||||||||||
|
Hi Hakan, Please provide a code snippet illustrating how your application configures its CodecRegistry. The intention was for this to be non-breaking, but the optimization added for this issue does lookup Codec instances eagerly rather than lazily, so I could see how this may result in the exception that you're seeing. | ||||||||||||||||||||||||
| Comment by Hakan Özler [ 21/Dec/16 ] | ||||||||||||||||||||||||
|
Hi Jeff, I have a tool called mongolastic and I have decided to upgrade the current mongodb java driver (which is 3.2.0) to 3.4.1. However, after upgrading the version, the codec structure that the tool uses doesn't work well. It gives the following error:
After investigating more, I am not %100 sure but this commit to the DocumentCodec class could break the code. Or I need to change the logic about my custom codes. Plus, I haven't seen any highlighted migration details about codec on the documentation . Here is the code I get this error. Do you have any advice for this problem? Regards, | ||||||||||||||||||||||||
| Comment by Jeffrey Yemin [ 23/Aug/16 ] | ||||||||||||||||||||||||
|
Hi Marc, The implementation of CodecRegistry created via the CodecRegistries class already deals with cyclic dependencies. That's why we're not seeing this in our tests. What's the reason for having your own implementation of CodecRegistry? Is it possible to simply use the one provided by the driver via CodecRegistries? If you're going to write your own implemenation, you will have to deal with cyclic dependencies, but the intention is that most users will use the driver's implementation. Regards, | ||||||||||||||||||||||||
| Comment by Marc Knaup [ 15/Aug/16 ] | ||||||||||||||||||||||||
|
When using a DocumentCodecProvider the cycle will look like this:
| ||||||||||||||||||||||||
| Comment by Marc Knaup [ 15/Aug/16 ] | ||||||||||||||||||||||||
|
This change broke nested documents (by using cyclic DocumentCodecs) for us. A DocumentCodec is initialized with a registry. The BsonTypeCodecMap initialized by the DocumentCodec immediately asks the registry for a DocumentCodec which we cannot return since it didn't return from its constructor yet. We can only break this initialization cycle issue with adding a layer of indirection which wraps the yet-to-be-initialized DocumentCodec. Is this a bug or are we handing cyclic/nested document serialization wrong? Can't BsonTypeCodecMap build it's codec array lazily instead of prefetching everything in its constructor? | ||||||||||||||||||||||||
| Comment by Githook User [ 18/Apr/16 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 10/Mar/16 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 10/Mar/16 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 10/Mar/16 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}Message: | ||||||||||||||||||||||||
| Comment by Githook User [ 10/Mar/16 ] | ||||||||||||||||||||||||
|
Author: {u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}Message: |