Remove recursion from deserialize functions

XMLWordPrintableJSON

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Critical - P2
    • None
    • Affects Version/s: None
    • Component/s: BSON, Performance
    • Not Needed
    • None
    • Hide

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Show
      1. What would you like to communicate to the user about this feature? 2. Would you like the user to see examples of the syntax and/or executable code and its output? 3. Which versions of the driver/connector does this apply to?
    • None
    • None
    • None
    • None
    • None
    • None

      Background

      The BSON deserializer in js-bson uses recursive descent for nested structures. deserializeObject() in src/parser/deserializer.ts recurses at three call sites - embedded document (0x03), array (0x04), and code-with-scope (0x0F) - with no depth tracking. A crafted BSON payload with ~10k nesting levels causes RangeError: Maximum call stack size exceeded, crashing the Node.js process.

      This is the most critical recursive path because it processes untrusted binary input from the network/queues/etc. Other recursive paths exist (serializer, calculate_size, EJSON serialize) but operate on application-constructed JS objects, making them lower-priority.

      Related tickets: NODE-7486 / HELP-90894 (security report), DRIVERS-2088 (cross-driver depth limit - backlog since 2019, no spec yet).

      Use Case

      As a user of js-bson
      I want the deserializer to use iteration instead of recursion
      So that deeply nested BSON documents don't crash the process and deserialization avoids the overhead of deep call stacks.

      User Impact

      • Eliminates process crash (DoS) from deeply nested BSON payloads
      • Potential performance improvement by removing call stack overhead per nesting level

      Dependencies

      • None

      Scope

      Deserializer - iterative rewrite (required):

      • Replace recursive deserializeObject() with an iterative implementation using an explicit stack (array of frames with push/pop)
      • Each stack frame holds the document being built, current buffer position, size, options, and a continuation describing what to do when the child completes (object assignment, array assignment + terminator check, or code-with-scope assembly)
      • No arbitrary depth limit - nesting bounded only by heap memory (~200 bytes per frame; 10k levels ≈ 2MB)
      • No public API changes

      Other recursive paths - iterative rewrite (optional, stretch goal):

      If time permits within this ticket, apply the same iterative approach to the remaining recursive paths. If not, create a follow-up ticket with full implementation details developed during the deserializer work.

      • Serializer (serializeInto) - 3 call sites recurse back into it (serializeObject, serializeCode, serializeDBRef); higher effort than deserializer
      • Size calculation (internalCalculateObjectSize / calculateElement) - simpler, only accumulates numbers
      • EJSON serialize (serializeValue / serializeDocument / serializeArray)

      Unknowns

      • Performance impact on typical documents (1–5 levels deep) - must benchmark to confirm no regression from stack frame allocation

      Acceptance Criteria

      Implementation Requirements

      • Deserializer uses explicit stack (array push/pop) instead of recursion for nested documents, arrays, and code-with-scope
      • No new public API surface - stack is purely internal

      Testing Requirements

      • All existing BSON tests pass unchanged (behavioral equivalence)
      • New unit test: BSON document with 20,000 nesting levels deserializes successfully
      • New unit test: same for nested arrays and code-with-scope
      • Round-trip test: serialize a deeply nested JS object, deserialize, compare
      • Benchmark: deserialize on typical documents shows no regression

      Documentation Requirements

      • None (no public API changes)

      Follow Up Requirements

      • If other recursive paths were not completed: create follow-up ticket with implementation details from this work

            Assignee:
            Unassigned
            Reporter:
            Neal Beeken
            None
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated: