Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: BSON, Performance
Labels:
- node-internal

Compass/DevTools Changes:
Not Needed
Confidence Status:
None

Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
2. Would you like the user to see examples of the syntax and/or executable code and its output?
3. Which versions of the driver/connector does this apply to?


Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Link:
None
Goal Name(s):
None

Use Case

As a... BSON user
I want... to deserialize large Latin-only BSON strings quickly, avoiding the overhead of per-byte UTF-8 decoding
So that... I can speed up my application

An idea from anna.henningsen@mongodb.com: Try viewing the whole BSON document as a string, then find the beginning and end of each string value within that view, to speed up fetching long Latin strings from documents.

User Experience

What is the desired/expected outcome for the user once this ticket is implemented?
- Long strings are parsed faster

Dependencies

upstream and/or downstream requirements and timelines to bear in mind
- None

Risks/Unknowns

What could go wrong while implementing this change? (e.g., performance, inadvertent behavioral changes in adjacent functionality, existing tech debt, etc)
- Care must be taken not to misinterpret multibyte UTF-8 sequences
- The recursive structure of the current deserializer may make this difficult to implement. Refactoring may be necessary so that the string view can be shared across the whole decoding process.
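
To illustrate the first risk, here is a minimal sketch (not taken from the BSON library) of what happens when UTF-8 bytes are read through a one-character-per-byte view. It uses only Node's `Buffer`; the sample string is an assumption chosen for illustration.

```javascript
// 'é' is encoded in UTF-8 as the two bytes 0xC3 0xA9.
const bytes = Buffer.from('café', 'utf8');

// Decoding as UTF-8 combines the two bytes back into one character.
const asUtf8 = bytes.toString('utf8');

// Decoding as latin1 maps every byte to its own character,
// so the multibyte sequence is split into two wrong characters.
const asLatin1 = bytes.toString('latin1');

// In UTF-8, every byte of a multibyte sequence is >= 0x80,
// so a single high byte is enough to rule out the fast path.
const hasHighByte = bytes.some((b) => b >= 0x80);

console.log({ asUtf8, asLatin1, hasHighByte });
```

Only byte ranges with no byte at or above 0x80 (plain ASCII) read identically through both decodings.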
Is there an opportunity for better cross-driver alignment or testing in this area?
- Possibly. If performance improves, we should share the approach with other drivers where it is possible in their language. It is not necessarily something for the specs.
Is there an opportunity to improve existing documentation on this subject?
- No

Acceptance Criteria

Implementation Requirements

Attempt:
- viewing the BSON document bytes as a JS string
- determining the start and end offsets of each string and taking slices from that view as the BSON is parsed
- validating that the string does not contain multibyte sequences
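
The steps above could be sketched roughly as follows. This is not the library's deserializer: `buildStringDoc`, `readStringFields`, and `isAscii` are hypothetical helpers, the sketch only handles top-level string elements (type 0x02), and the "no byte >= 0x80" check is one assumed way to do the validation.

```javascript
// Hypothetical helper: hand-builds a BSON document with only string fields.
function buildStringDoc(fields) {
  const parts = [];
  for (const [key, value] of Object.entries(fields)) {
    const valueBytes = Buffer.from(value, 'utf8');
    const length = Buffer.alloc(4);
    length.writeInt32LE(valueBytes.length + 1); // length includes trailing NUL
    parts.push(
      Buffer.from([0x02]),              // element type: string
      Buffer.from(key + '\0', 'utf8'),  // cstring key
      length,
      valueBytes,
      Buffer.from([0x00])               // string terminator
    );
  }
  const body = Buffer.concat([...parts, Buffer.from([0x00])]);
  const size = Buffer.alloc(4);
  size.writeInt32LE(body.length + 4);   // total size includes the size field
  return Buffer.concat([size, body]);
}

// Returns true when no byte in [start, end) belongs to a multibyte sequence.
function isAscii(bytes, start, end) {
  for (let i = start; i < end; i++) {
    if (bytes[i] >= 0x80) return false;
  }
  return true;
}

// Hypothetical sketch of the approach: decode the document once as latin1,
// then slice string values out of that view by byte offset.
function readStringFields(doc) {
  const view = doc.toString('latin1'); // character i corresponds to byte i
  const result = {};
  let offset = 4; // skip the int32 document size
  while (doc[offset] !== 0x00) {
    if (doc[offset] !== 0x02) throw new Error('sketch only handles string elements');
    offset += 1;
    const keyEnd = doc.indexOf(0x00, offset);
    const key = doc.toString('utf8', offset, keyEnd);
    offset = keyEnd + 1;
    const byteLength = doc.readInt32LE(offset) - 1; // exclude trailing NUL
    const start = offset + 4;
    const end = start + byteLength;
    result[key] = isAscii(doc, start, end)
      ? view.slice(start, end)            // fast path: slice the shared view
      : doc.toString('utf8', start, end); // fallback: real UTF-8 decoding
    offset = end + 1; // skip the trailing NUL
  }
  return result;
}

const doc = buildStringDoc({ a: 'hello', b: 'café' });
const parsed = readStringFields(doc);
console.log(parsed);
```

Whether one whole-document latin1 decode plus slicing actually beats per-string decoding is exactly what the ticket proposes to measure; the sketch only shows that the offsets line up.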

Testing Requirements

Check for correctness.
If a performance test for long strings does not exist, add one to main first.
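
A correctness check and a rough timing harness might look like the sketch below. It does not use the driver's benchmark suite: `decodeBaseline`, `decodeCandidate`, and `timeMs` are hypothetical names, the sample inputs are assumptions, and the timing loop is only a quick comparison, not a substitute for a real performance test.

```javascript
const utf8Decoder = new TextDecoder('utf-8');

// Baseline: always decode as UTF-8.
function decodeBaseline(bytes) {
  return utf8Decoder.decode(bytes);
}

// Candidate: take the latin1 fast path only when every byte is below 0x80.
function decodeCandidate(bytes) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] >= 0x80) return utf8Decoder.decode(bytes);
  }
  return bytes.toString('latin1');
}

// Sample inputs: long ASCII, ASCII with one multibyte character, empty, non-Latin.
const samples = [
  'a'.repeat(1 << 20),
  'x'.repeat(1000) + 'é',
  '',
  '日本語',
].map((s) => Buffer.from(s, 'utf8'));

// Correctness: both decoders must agree on every sample.
const mismatches = samples.filter(
  (bytes) => decodeBaseline(bytes) !== decodeCandidate(bytes)
);

// Rough timing helper; returns elapsed milliseconds for `iterations` calls.
function timeMs(fn, bytes, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn(bytes);
  return Number(process.hrtime.bigint() - start) / 1e6;
}

console.log('mismatches:', mismatches.length);
console.log('baseline ms:', timeMs(decodeBaseline, samples[0], 50));
console.log('candidate ms:', timeMs(decodeCandidate, samples[0], 50));
```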

Documentation Requirements

None

Follow Up Requirements

additional tickets to file, required releases, etc
if node behavior differs/will differ from other drivers, confirm with dbx devs what standard to aim for and what plan, if any, exists to reconcile the diverging behavior moving forward
- Are there additional optimizations if the string contains only a small number of multibyte characters?

Assignee: Unassigned
Reporter: Neal Beeken
Reviewers: None
Votes: 0
Watchers: 2

Created: Dec 10 2024 02:28:11 PM UTC
Updated: Jan 28 2026 08:08:29 PM UTC
