[DRIVERS-2154] BSON corpus: another test case for subdocuments that are too short Created: 23/Aug/16  Updated: 31/Mar/22

Status: Backlog
Project: Drivers
Component/s: BSON
Fix Version/s: None

Type: Spec Change Priority: Minor - P4
Reporter: Luke Lovett Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Driver Changes: Needed

 Description   

This applies to the BSON Corpus spec: https://github.com/mongodb/specifications/blob/master/source/bson-corpus/bson-corpus.rst

The following test case exposed a bug in PyMongo's array decoding: https://github.com/mongodb/specifications/blob/master/source/bson-corpus/tests/array.json#L34-L37.

PyMongo validates the array's size by reading the int32 length that prefixes the array, looking at the byte that lies that far from the current position, and checking whether it is 0x00: https://github.com/mongodb/mongo-python-driver/blob/3.3.0/bson/__init__.py#L158-L161

The interesting thing about the above test case is that the byte at that position is 0x00. PyMongo happily accepts such an array without raising an error, even though the 0x00 byte is part of the field value, not the terminator for the array.
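
To make the failure mode concrete, here is a minimal sketch of the kind of check described above. It is not PyMongo's actual code: the function name naive_end_check is hypothetical, and I am assuming the terminator is expected at the last byte covered by the declared length. The input is a hand-built example ({'a': [10]} with the array length declared as 11 instead of 12), not necessarily the bytes used in array.json.

```python
import struct


def naive_end_check(data: bytes, position: int) -> bool:
    """Hypothetical check: read the int32 length at `position` and test
    whether the byte at the declared end of the object is 0x00."""
    (size,) = struct.unpack_from("<i", data, position)
    end = position + size - 1  # assumed location of the terminator
    return data[end] == 0


# Illustrative input: {'a': [10]} with the array length declared as
# 0x0b (11) rather than the correct 0x0c (12).
data = bytes.fromhex("140000000461000b0000001030000a0000000000")
array_start = 7  # 4-byte document length + type byte + "a\x00"

# The check passes, although the byte it inspects (offset 17) is the
# last byte of the int32 value 10, not the array terminator (offset 18).
print(naive_end_check(data, array_start))
```

A check like this only proves that some 0x00 byte sits at the declared end; it does not prove that the elements inside the array end exactly there.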

PyMongo has the same problem when decoding subdocuments, but the BSON corpus tests don't have a case for this. I'm proposing that we add one to document.json like this:

{
  "description": "Subdocument too short, but terminator looks like EOO.",
  "bson": "140000000361000b0000001062000a0000000000"
}

For reference, the above BSON is the following document, with the declared length of subdocument 'a' decreased by one (0x0b instead of 0x0c):

{'a': {'b': 10}}
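
The proposed bytes can be reproduced by hand with a short sketch like the one below. The helper names (encode_int32_element, encode_document) and the length_adjust parameter are made up for illustration and are not part of any driver API; only the struct module from the standard library is used.

```python
import struct


def encode_int32_element(name: str, value: int) -> bytes:
    """Encode one int32 element: type 0x10, cstring name, little-endian value."""
    return b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)


def encode_document(elements: bytes, length_adjust: int = 0) -> bytes:
    """Wrap elements in a length prefix and a 0x00 terminator.
    `length_adjust` deliberately corrupts the declared length."""
    size = 4 + len(elements) + 1
    return struct.pack("<i", size + length_adjust) + elements + b"\x00"


# Subdocument {'b': 10} with its declared length decreased by one.
inner = encode_document(encode_int32_element("b", 10), length_adjust=-1)
# Outer document {'a': <inner>}: type 0x03, name "a", then the subdocument.
outer = encode_document(b"\x03" + b"a\x00" + inner)
print(outer.hex())

sub_start = 7                                  # 4-byte length + 0x03 + "a\x00"
(declared,) = struct.unpack_from("<i", outer, sub_start)
declared_end = sub_start + declared - 1        # where a naive check looks
value_start = sub_start + 4 + 1 + 2            # after length, type, "b\x00"
value_end = value_start + 4                    # int32 value occupies 4 bytes
print(declared_end, value_start, value_end)
```

The declared end falls inside the int32 value of 'b', and the byte there happens to be 0x00, which is why a decoder that only inspects that one byte would accept the document.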

