[JAVA-2144] For Codec decoding, optimize code path for looking up Codec based on a BsonType Created: 07/Mar/16  Updated: 22/Dec/16  Resolved: 10/Mar/16

Status: Closed
Project: Java Driver
Component/s: BSON, Performance
Affects Version/s: 3.0.0
Fix Version/s: 3.3.0

Type: Improvement Priority: Major - P3
Reporter: Jeffrey Yemin Assignee: Jeffrey Yemin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

For Document, BsonDocument, and DBObject Codec.decode implementations, there is a code path for looking up the Codec for a value that requires two lookups in a Map; first by one in BsonTypeClassMap, then followed by one in CodecCache. This could easily be turned into a single lookup in an array that has an entry for each BsonType and a value which is the Codec to use for that BsonType.

Performance benchmarks show that the performance gains would be significant for simple documents.



 Comments   
Comment by Hakan Özler [ 22/Dec/16 ]

Hi Jeff,

Thank you for the detailed information. I have changed the code based on your advice and now it works as expected.

Comment by Jeffrey Yemin [ 22/Dec/16 ]

Hi Hakan,

The reason this started throwing an exception in 3.4 is due to the optimization introduced for this issue. To avoid repeated lookups in the CodecRegistry, DocumentCodec now caches the Codec instances required for each BsonType. Since your replacements maps to CustomDataCodec, this cache is trying to look up a Codec for the CustomDataCodec class, which was not the intention.

So basically, the gist you provided is not using 'replacements' correctly. Replacements is a map from the BsonType of the value that's stored in MongoDB to the class to which you want to decode values of that type. It's not used for decoding at all. I think what you want here is to just not set replacements at all. Then all your dates and longs will be encoded as strings, as it appears is your intention. Of course, on decoding they will be decoded as strings, as there will no longer be a way to differentiate strings, dates, and integers.

Comment by Hakan Özler [ 22/Dec/16 ]

Hi Jeff,

Thank you for the quick reply. I have made a gist that uses only the necessary codec classes of the BSON api.

Comment by Jeffrey Yemin [ 21/Dec/16 ]

Hi Hakan,

Please provide a code snippet illustrating how your application configures its CodecRegistry. The intention was for this to be non-breaking, but the optimization added for this issue does lookup Codec instances eagerly rather than lazily, so I could see how this may result in the exception that you're seeing.

Comment by Hakan Özler [ 21/Dec/16 ]

Hi Jeff,

I have a tool called mongolastic and I have decided to upgrade the current mongodb java driver (which is 3.2.0) to 3.4.1. However, after upgrading the version, the codec structure that the tool uses doesn't work well. It gives the following error:

Exception in thread "main" org.bson.codecs.configuration.CodecConfigurationException: Can't find a codec for class com.kodcu.util.codecs.CustomDateCodec.
	at org.bson.codecs.configuration.CodecCache.getOrThrow(CodecCache.java:46)
	at org.bson.codecs.configuration.ProvidersCodecRegistry.get(ProvidersCodecRegistry.java:63)
	at org.bson.codecs.configuration.ChildCodecRegistry.get(ChildCodecRegistry.java:51)
	at org.bson.codecs.BsonTypeCodecMap.<init>(BsonTypeCodecMap.java:44)
	at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:86)
	at org.bson.codecs.DocumentCodecProvider.get(DocumentCodecProvider.java:84)
	at org.bson.codecs.configuration.ProvidersCodecRegistry.get(ProvidersCodecRegistry.java:43)
	at org.bson.codecs.configuration.ProvidersCodecRegistry.get(ProvidersCodecRegistry.java:55)
	at org.bson.codecs.configuration.ChildCodecRegistry.get(ChildCodecRegistry.java:51)
	at org.bson.codecs.DocumentCodecProvider.get(DocumentCodecProvider.java:80)
	at org.bson.codecs.configuration.ProvidersCodecRegistry.get(ProvidersCodecRegistry.java:43)
	at org.bson.codecs.configuration.ProvidersCodecRegistry.get(ProvidersCodecRegistry.java:55)
	at org.bson.codecs.configuration.ProvidersCodecRegistry.get(ProvidersCodecRegistry.java:37)
	at org.bson.codecs.BsonTypeCodecMap.<init>(BsonTypeCodecMap.java:44)
	at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:86)
	at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:72)
	at com.kodcu.service.ElasticBulkService.getEncoder(ElasticBulkService.java:136)
	at com.kodcu.service.ElasticBulkService.<init>(ElasticBulkService.java:59)
	at com.kodcu.main.Mongolastic.initializeBulkService(Mongolastic.java:88)
	at com.kodcu.main.Mongolastic.proceedService(Mongolastic.java:68)
	at java.util.Optional.ifPresent(Optional.java:159)
	at com.kodcu.main.Mongolastic.start(Mongolastic.java:59)
	at com.kodcu.main.Mongolastic.main(Mongolastic.java:36)

After investigating more, I am not %100 sure but this commit to the DocumentCodec class could break the code. Or I need to change the logic about my custom codes. Plus, I haven't seen any highlighted migration details about codec on the documentation . Here is the code I get this error. Do you have any advice for this problem?

Regards,
Hakan

Comment by Jeffrey Yemin [ 23/Aug/16 ]

Hi Marc,

The implementation of CodecRegistry created via the CodecRegistries class already deals with cyclic dependencies. That's why we're not seeing this in our tests.

What's the reason for having your own implementation of CodecRegistry? Is it possible to simply use the one provided by the driver via CodecRegistries? If you're going to write your own implemenation, you will have to deal with cyclic dependencies, but the intention is that most users will use the driver's implementation.

Regards,
Jeff

Comment by Marc Knaup [ 15/Aug/16 ]

When using a DocumentCodecProvider the cycle will look like this:

at org.bson.codecs.BsonTypeCodecMap.<init>(BsonTypeCodecMap.java:44)
at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:86)
at org.bson.codecs.DocumentCodecProvider.get(DocumentCodecProvider.java:84)
at widgetlabs.mongodb.SimpleCodecRegistry.get(SimpleCodecRegistry.kt:30)
at org.bson.codecs.BsonTypeCodecMap.<init>(BsonTypeCodecMap.java:44)
at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:86)
at org.bson.codecs.DocumentCodecProvider.get(DocumentCodecProvider.java:84)
at widgetlabs.mongodb.SimpleCodecRegistry.get(SimpleCodecRegistry.kt:30)
at org.bson.codecs.BsonTypeCodecMap.<init>(BsonTypeCodecMap.java:44)
at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:86)
at org.bson.codecs.DocumentCodecProvider.get(DocumentCodecProvider.java:84)
at widgetlabs.mongodb.SimpleCodecRegistry.get(SimpleCodecRegistry.kt:30)
at org.bson.codecs.BsonTypeCodecMap.<init>(BsonTypeCodecMap.java:44)
at org.bson.codecs.DocumentCodec.<init>(DocumentCodec.java:86)
at org.bson.codecs.DocumentCodecProvider.get(DocumentCodecProvider.java:84)
at widgetlabs.mongodb.SimpleCodecRegistry.get(SimpleCodecRegistry.kt:30)

Comment by Marc Knaup [ 15/Aug/16 ]

This change broke nested documents (by using cyclic DocumentCodecs) for us.

A DocumentCodec is initialized with a registry. The BsonTypeCodecMap initialized by the DocumentCodec immediately asks the registry for a DocumentCodec which we cannot return since it didn't return from its constructor yet.

We can only break this initialization cycle issue with adding a layer of indirection which wraps the yet-to-be-initialized DocumentCodec.

Is this a bug or are we handing cyclic/nested document serialization wrong?

Can't BsonTypeCodecMap build it's codec array lazily instead of prefetching everything in its constructor?

Comment by Githook User [ 18/Apr/16 ]

Author:

{u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}

Message: JAVA-2144: Fix @since annotation for BsonTypeCodecMap
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/7b8af2f68110cbfbe3363d471c7f7c8aa1540a17

Comment by Githook User [ 10/Mar/16 ]

Author:

{u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}

Message: JAVA-2144: Optimize DBOBjectCodec via use of BsonTypeCodecMap
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/8810c0a2e379fea40b6388c8a5aad31aa32aed82

Comment by Githook User [ 10/Mar/16 ]

Author:

{u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}

Message: JAVA-2144: Optimize BsonDocumentCodec via use of BsonTypeCodecMap
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/28cc882b49c7bbcbf49df63bb28fe797c4a8778c

Comment by Githook User [ 10/Mar/16 ]

Author:

{u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}

Message: JAVA-2144: Optimize DocumentCodec via use of BsonTypeCodecMap
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/9024d7e4bd828de78713e6bdbe5296eeb69978b1

Comment by Githook User [ 10/Mar/16 ]

Author:

{u'username': u'jyemin', u'name': u'Jeff Yemin', u'email': u'jeff.yemin@10gen.com'}

Message: JAVA-2144: Add BsonTypeCodecMap to optimize retrieval of a Codec for a BsonType
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/d15a39e58515fe1950f08d98c1d82fa3b1a13bef

Generated at Thu Feb 08 08:56:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.