-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Minor - P4
-
None
-
Affects Version/s: None
-
Component/s: None
-
Server Programmability
Right now the BSON parsing library assumes that bson is well-formed. This is generally valid because we do validation when accepting messages over the network, our disk images are crc32c-checked, and all bson we generate internally should be inherently well formed. That said, we would benefit from some defense in depth to limit the damage if ill-formed BSON makes its way into a BSONObj or similar types. Here is a non-exhaustive list of things we can do to improve this:
First the things that should have little to no perf impact:
- All failed checks should be {{tassert}}s rather than {{uassert}}s or {{invariant}}s. It is supposed to be impossible to get invalid BSON into BSONObj and friends, so it is definitely a programmer error if it happens. So we want to know about it if it happens in tests (except tests specifically testing how they handle invalid bson), but we also don't want it to bring down the whole server if it happens.
- All BSON iteration knows where the end of the object is. Before handing out an element, we should check its size and ensure that it ends before the end of the object. We should remember the size that we don't need to recompute it when advancing.
- The hand-rolled iteration in getField() should either use a normal iterator or do the same bounds check before returning.
- We should change the BSONObjStlIterator type to be a C++20-range style range rather than an older style. This gives us two benefits:
- end() is allowed to return a different type (a sentinel), so operator== knows when it is doing an end check, and can do >= like BSONObj, rather than ==. This ensures that if we somehow end up past the end we exit the loop rather than continuing forever. Alternatively, we could throw in the > case, since something is very wrong if we get there.
- operator* doesn't need to return a `T&`, so we can return a BSONElement by value rather than storing one as a member. This gives us more flexibility in how we represent the data in the iterator.
- The computation of BSONElement size is also responsible for validating the element type. It needs to validate all bits of the type, not just the low 5 bits. One option (not the only one) is to use a 256-byte static array, rather than a 32-byte one.
These will add additional computation compared to what we do today (rather than just moving around where we do them), so they may have a perf impact. We should check if they are tolerable rather than assuming that they won't be:
- firstElement() should do a bounds check prior to returning the element.
- I think between this, the iterators, and getField, we will cover all ways to get an element out of an object. But if there are any more, they should also be size-checked.
- We used to validate in the BSONObj constructor that took a pointer that the object ended with an EOO (0x00) byte (ie objdata()[objsize()] == 0). We should do that again, because some logic relies on that. It also provides assurance that any {{strlen within the object will terminate before going out of bounds.
- The constructor from a ConstSharedBuffer should do that, as well as ensuring that the object size isn't larger than the buffer's capacity.
- shareOwnershipWith() should check that the object is completely contained in the shared buffer.
- Right now, some of getters on BSONElement do not check that the type is what it is claimed to be. The assumption is that the caller has already validated this. We should explore if the perf of actually checking (in a tassert) would be prohibitive, to catch programmer errors. It may end up being free since most calls should be immediately after a type check anyway. We could consider making private unchecked methods that are only used internally by BSONElement, where it is easy to audit that they are being used correctly.
- If we end up keeping the unchecked methods, we should rename them to include Unsafe or Unchecked to visibly warn about their use at the call sites. This will also force us to audit all current users to ensure that they actually do check.
- Validate that variable-sized elements never have negative sizes.
And here are a few ideas that don't directly improve safety, but may improve the performance to reduce the impact of the extra checking, potentially making more checking possible:
- Explore storing the element size and/or type in BSONElement.
- Whatever we do, we must ensure that on a 64-bit platform it doesn't become larger than 16 bytes, otherwise it won't be passed or returned in registers which negates any potential improvement here. I suspect that that means we can only put size or type in the element and I suspect that type will be better, but we should measure to be sure.
- We need to load the type in order to compute the size. And basically any usage of BSONElement will also check the type. This includes the methods that check that the type is as expected (which we want more of). If we put it in the element, it can get it from the registers without reloading through the pointer. Also it should be easier for the compiler to see that the BSONElement is unaliased/unmodified than the pointed-to buffer.
- Putting the size in the element avoids needing to recompute it again. It also would let things like valueStringData() avoid putting loading through the pointer when returning a StringData.
- Consider marking the pointers in BSONObj and BSONElement with _restrict_ and/or attributes that inform the compiler that they are always non-null (since they point to static data rather than null when empty).
- _fieldNameSize should be either 1 higher or lower. Right now it is the size including the NUL byte. I think this is basically never what we want. When returning fieldNameStringData we need to subtract 1. And when computing the start address of value() we need to add add one more to account for the type. If we choose to add 1 (which I suspect is better), we should rename the member _valueOffset.
Note: this ticket is only focused on spatial memory safety, ie not reading outside of the bounds of the bson object. Temporal memory safety, ie not accessing already freed memory through a BSON view, is a separate issue that will be tackled differently.