  1. Java Driver
  2. JAVA-5307

Address low hanging fruit in performance benchmarks

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Unknown
    • Affects Version/s: None
    • Component/s: Performance
    • Performance Improvements Phase 1
    • Java Drivers
      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?
    • In Progress
    • 🟢 On Track

      2025-03-14

      Last two weeks?

      • Refined BSON byte buffer numeric optimization changes and created a PR for review.
      • Merged codec performance improvements into the main branch.
      • Experimented with BSON read path optimizations, achieving a 25-30% improvement.
      • Created a PR for adding a Netty benchmark suite to measure the performance impact of recent changes.
      • Reviewed external performance PRs for BsonOutput improvements.

      Focus over the next two weeks?

      • Finalize writeString optimizations and create a PR for review.
      • Finalize read path optimizations and create a PR for review.
      • Introduce comprehensive test coverage to ensure there are no regressions after optimizations.

      Impediments encountered in the last two weeks?

      • While running performance tests, discovered a bug leading to a resource leak in Netty transport settings (Ticket: JAVA-5812).
      • Long wait times when running a comprehensive performance analysis on the Evergreen perf analyzer.

      2025-02-26

      Last two weeks?

      • Optimized BSON encoding and decoding by making codec lookups more efficient in both BsonArrayCodec and BsonDocumentCodec.
      • Added JMH into the repository, enabling local benchmarking to assess the relative performance impact of small components beyond spec benchmark tests.
      • Set up performance test environment on Evergreen spawn host to enable Linux perf CPU profiler.
      • Profiled the encoding/write path to identify hotspots for optimization.
      • Experimented with byte buffer numeric optimizations. Initial experiments led to an approximately 20-30% improvement in insert document rate.
      • Experimented with byte buffer string optimizations. Experiments showed a 30-90% insert rate improvement, with Deep BSON encoding achieving up to a 200% improvement.

      Focus over the next two weeks?

      • Refine BSON byte buffer numeric optimization changes and create PR for review.
      • Get codec performance improvements through review and merge into the main branch.
      • Continue performance optimizations of writeString and determine next optimization steps.
      • Start profiling BSON read path to look for potential improvements. 
      • Review external performance PRs for BsonOutput improvements.
      • Consider adding a Netty benchmark suite to measure the performance impact of recent changes.

      Impediments encountered in the last two weeks?

      • Discovered a discrepancy in bulk write behavior (between the client bulk write and the older one) while improving BsonDocumentCodec. This needs further clarification.

      We should take a close look at the driver's performance benchmark results and look for opportunities to improve. As an example, some of the things observed while running the large doc bulk insert benchmark:

      • Use of java.util.Stack in BsonWriter implementations. All the methods of this class are synchronized, which is unnecessary in non-thread safe classes. Could use ArrayDeque instead.
      • Lots of calls to CodecRegistry#get in DocumentCodec#writeValue and BsonDocumentCodec#writeValue. The implementation of this method in ProvidersCodecRegistry is not built for use in inner loops. Some caching within the Codec implementation could be useful here.
      • ByteBufferBsonOutput is great for minimizing heap use, but all the buffer management that it has to do in an inner loop carries a performance cost. We can consider using a simpler implementation of OutputBuffer that trades off memory use for speed. For example, we could just cache 48MB buffers instead of power-of-two buffers.
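
As a rough illustration of the first item above, the sketch below tracks nested writer contexts with an ArrayDeque rather than a java.util.Stack. The class ContextStackSketch and its ContextType enum are hypothetical stand-ins invented for this example; they are not the driver's actual BsonWriter classes, and the real change may look different.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a writer's nested-context tracking; not driver code.
final class ContextStackSketch {
    enum ContextType { TOP_LEVEL, DOCUMENT, ARRAY }

    // ArrayDeque is unsynchronized, whereas java.util.Stack extends Vector
    // and pays for synchronization that a single-threaded writer does not need.
    private final Deque<ContextType> contexts = new ArrayDeque<>();

    ContextStackSketch() {
        contexts.push(ContextType.TOP_LEVEL);
    }

    void startDocument() {
        contexts.push(ContextType.DOCUMENT);
    }

    void startArray() {
        contexts.push(ContextType.ARRAY);
    }

    // Closes the innermost document or array and returns its type.
    ContextType end() {
        if (contexts.size() == 1) {
            throw new IllegalStateException("no open document or array");
        }
        return contexts.pop();
    }

    ContextType current() {
        return contexts.peek();
    }

    // Nesting depth, not counting the top-level context.
    int depth() {
        return contexts.size() - 1;
    }

    public static void main(String[] args) {
        ContextStackSketch writer = new ContextStackSketch();
        writer.startDocument();
        writer.startArray();
        System.out.println(writer.current() + " at depth " + writer.depth());
        writer.end();
        writer.end();
        System.out.println(writer.current() + " at depth " + writer.depth());
    }
}
```

One difference to keep in mind if this swap is made: ArrayDeque rejects null elements, while Stack accepts them.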

      And some others:

      • BsonArrayCodec should use a BsonTypeCodecMap just like BsonDocumentCodec does
      • BsonArrayCodec shouldn't make a copy of its elements when decoding
      • BsonDocumentCodec shouldn't make a copy of its elements when decoding
      • BsonDocumentCodec should use its BsonTypeCodecMap for encoding, not just decoding
      • BsonDocumentCodec _id field re-ordering could be more efficient
      • Introduce valueOf methods in BsonInt32 and BsonInt64 (and use them in corresponding Codec) that are roughly equivalent to the ones in Integer and Long (which have caches for small values).
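
The valueOf idea in the last item could be sketched as follows. Int32Value is a hypothetical class used only for illustration, not the actual BsonInt32, and the cached range of -128 to 127 is an assumption borrowed from the default range of Integer.valueOf.

```java
// Hypothetical immutable wrapper standing in for BsonInt32; not driver code.
final class Int32Value {
    // Assumed cache bounds, mirroring the default range of Integer.valueOf.
    private static final int LOW = -128;
    private static final int HIGH = 127;
    private static final Int32Value[] CACHE = new Int32Value[HIGH - LOW + 1];

    static {
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new Int32Value(LOW + i);
        }
    }

    private final int value;

    private Int32Value(int value) {
        this.value = value;
    }

    // Returns a shared instance for small values and allocates otherwise.
    static Int32Value valueOf(int value) {
        if (value >= LOW && value <= HIGH) {
            return CACHE[value - LOW];
        }
        return new Int32Value(value);
    }

    int getValue() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Int32Value && ((Int32Value) o).value == value;
    }

    @Override
    public int hashCode() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(valueOf(5) == valueOf(5));            // same cached instance
        System.out.println(valueOf(1000).equals(valueOf(1000))); // equal, separately allocated
    }
}
```

Sharing instances this way is only safe because the wrapper is immutable. Callers should still compare with equals, since values outside the cached range are distinct objects.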

      There are likely more opportunities available as well.
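
To make the codec lookup items above more concrete, here is a minimal sketch of caching lookups in front of a registry so that an inner loop resolves each class only once. ValueEncoder, EncoderRegistry, SlowRegistry, and CachingLookup are all hypothetical names invented for this example. They are simplified stand-ins for Codec and CodecRegistry and do not reflect how ProvidersCodecRegistry or BsonTypeCodecMap are implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class CodecLookupCacheSketch {
    // Hypothetical, simplified stand-ins for Codec and CodecRegistry.
    interface ValueEncoder<T> {
        String encode(T value);
    }

    interface EncoderRegistry {
        <T> ValueEncoder<T> get(Class<T> clazz);
    }

    // Simulates a registry whose lookups are too costly for inner loops,
    // and counts how often it is consulted.
    static final class SlowRegistry implements EncoderRegistry {
        final AtomicInteger lookups = new AtomicInteger();

        @Override
        public <T> ValueEncoder<T> get(Class<T> clazz) {
            lookups.incrementAndGet();
            return value -> clazz.getSimpleName() + ":" + value;
        }
    }

    // Remembers the encoder resolved for each class after the first lookup.
    static final class CachingLookup {
        private final EncoderRegistry registry;
        private final Map<Class<?>, ValueEncoder<?>> cache = new ConcurrentHashMap<>();

        CachingLookup(EncoderRegistry registry) {
            this.registry = registry;
        }

        @SuppressWarnings("unchecked")
        <T> ValueEncoder<T> get(Class<T> clazz) {
            ValueEncoder<?> cached = cache.get(clazz);
            if (cached == null) {
                cached = registry.get(clazz);
                cache.put(clazz, cached);
            }
            return (ValueEncoder<T>) cached;
        }
    }

    public static void main(String[] args) {
        SlowRegistry registry = new SlowRegistry();
        CachingLookup lookup = new CachingLookup(registry);
        for (int i = 0; i < 1000; i++) {
            lookup.get(Integer.class).encode(i);
        }
        System.out.println("registry lookups: " + registry.lookups.get());
    }
}
```

The sketch uses a ConcurrentHashMap on the assumption that codec instances are shared across threads. Under concurrent first use the registry may be consulted more than once for the same class, which is harmless here because the resolved encoders are interchangeable.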

            Assignee:
            slav.babanin@mongodb.com Slav Babanin
            Reporter:
            jeff.yemin@mongodb.com Jeffrey Yemin
            Votes:
            0
            Watchers:
            5
