-
Type:
Improvement
-
Resolution: Unresolved
-
Priority:
Critical - P2
-
None
-
Affects Version/s: None
-
Component/s: BSON, Performance
Background
The BSON deserializer in js-bson uses recursive descent for nested structures. deserializeObject() in src/parser/deserializer.ts recurses at three call sites - embedded document (0x03), array (0x04), and code-with-scope (0x0F) - with no depth tracking. A crafted BSON payload with ~10k nesting levels causes RangeError: Maximum call stack size exceeded, crashing the Node.js process.
This is the most critical recursive path because it processes untrusted binary input from the network/queues/etc. Other recursive paths exist (serializer, calculate_size, EJSON serialize) but operate on application-constructed JS objects, making them lower-priority.
Related tickets: NODE-7486 / HELP-90894 (security report), DRIVERS-2088 (cross-driver depth limit - backlog since 2019, no spec yet).
Use Case
As a user of js-bson
I want the deserializer to use iteration instead of recursion
So that deeply nested BSON documents don't crash the process and deserialization avoids the overhead of deep call stacks.
User Impact
- Eliminates process crash (DoS) from deeply nested BSON payloads
- Potential performance improvement by removing call stack overhead per nesting level
Dependencies
- None
Scope
Deserializer - iterative rewrite (required):
- Replace recursive deserializeObject() with an iterative implementation using an explicit stack (array of frames with push/pop)
- Each stack frame holds the document being built, current buffer position, size, options, and a continuation describing what to do when the child completes (object assignment, array assignment + terminator check, or code-with-scope assembly)
- No arbitrary depth limit - nesting bounded only by heap memory (~200 bytes per frame; 10k levels ≈ 2MB)
- No public API changes
Other recursive paths - iterative rewrite (optional, stretch goal):
If time permits within this ticket, apply the same iterative approach to the remaining recursive paths. If not, create a follow-up ticket with full implementation details developed during the deserializer work.
- Serializer (serializeInto) - 3 call sites recurse back into it (serializeObject, serializeCode, serializeDBRef); higher effort than deserializer
- Size calculation (internalCalculateObjectSize / calculateElement) - simpler, only accumulates numbers
- EJSON serialize (serializeValue / serializeDocument / serializeArray)
Unknowns
- Performance impact on typical documents (1–5 levels deep) - must benchmark to confirm no regression from stack frame allocation
Acceptance Criteria
Implementation Requirements
- Deserializer uses explicit stack (array push/pop) instead of recursion for nested documents, arrays, and code-with-scope
- No new public API surface - stack is purely internal
Testing Requirements
- All existing BSON tests pass unchanged (behavioral equivalence)
- New unit test: BSON document with 20,000 nesting levels deserializes successfully
- New unit test: same for nested arrays and code-with-scope
- Round-trip test: serialize a deeply nested JS object, deserialize, compare
- Benchmark: deserialize on typical documents shows no regression
Documentation Requirements
- None (no public API changes)
Follow Up Requirements
- If other recursive paths were not completed: create follow-up ticket with implementation details from this work