- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Execution
- 200
SERVER-120255 optimized bson::getField() in SBE (values/bson.h) by replacing byte-by-byte strcmp with strlen() + length check + memcmp. This improved long-field-name performance (~73% faster for 64-char names) but regressed short-field-name workloads (BF-41932) due to unnecessary strlen() calls on miss paths.
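To make the trade-off concrete, here is a minimal sketch of the two comparison strategies described above. The function names and signatures are hypothetical and are not taken from values/bson.h; the sketch only assumes that a BSON field name is a NUL-terminated C string and that the needle is passed as a std::string_view.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper (not the SBE code): byte-by-byte comparison.
// Stops at the first mismatching byte, so a miss on a short or
// early-differing name touches very few bytes.
inline bool fieldNameEqualsByteWise(const char* bsonName, std::string_view needle) {
    std::size_t i = 0;
    for (; i < needle.size(); ++i) {
        // Reject if the candidate name ends early or the bytes differ.
        if (bsonName[i] == '\0' || bsonName[i] != needle[i]) {
            return false;
        }
    }
    // All needle bytes matched; the candidate must end here as well.
    return bsonName[i] == '\0';
}

// Hypothetical helper (not the SBE code): strlen() + length check + memcmp.
// strlen() always scans the whole candidate name, even when the lengths
// differ and the result is a miss.
inline bool fieldNameEqualsStrlenMemcmp(const char* bsonName, std::string_view needle) {
    const std::size_t len = std::strlen(bsonName);
    return len == needle.size() && std::memcmp(bsonName, needle.data(), len) == 0;
}
```

Both helpers return the same result. They differ only in how many bytes are read on a miss, which is the cost this ticket is concerned with.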
SERVER-121006 reworks the microbenchmark to be more representative (200 cases across 5 field name sizes, 4 document shapes, 5 presence patterns, 2 value types) and implements an initial hybrid approach.
This ticket covers three implementation experiments suggested during code review to further optimize the miss path, particularly for short field names.
Experiment 1: 8-byte load with zero-byte detection
Load 8 bytes from the BSON field name and use the hasZeroByte bit trick (same technique as getStringLength() in sbe/value.h) to detect whether the field name is shorter than 8 characters. If so, use a specialized fast strlen + memcmp path; otherwise fall back to the standard strlen + memcmp.
while (at least 8 bytes left)
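The zero-byte detection step can be sketched as follows. This is an illustration of the well-known "has zero byte" bit trick, not the code from sbe/value.h: the signature of hasZeroByte and the helper nameShorterThan8 are assumptions, and the caller is assumed to have already checked that at least 8 bytes are readable.

```cpp
#include <cstdint>
#include <cstring>

// Returns true iff at least one of the 8 bytes in v is 0x00.
// Subtracting 0x01 from every byte borrows into the high bit only for bytes
// that were zero (or had the high bit set already, which ~v filters out).
inline bool hasZeroByte(std::uint64_t v) {
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

// Hypothetical helper: true if the NUL terminator of the field name lies
// within the first 8 bytes, i.e. the name is shorter than 8 characters.
// Precondition (assumed): at least 8 bytes are readable at p.
inline bool nameShorterThan8(const char* p) {
    std::uint64_t word;
    std::memcpy(&word, p, sizeof(word));  // unaligned-safe 8-byte load
    return hasZeroByte(word);
}
```

A caller could use nameShorterThan8 to choose between the short-name fast path and the standard strlen + memcmp path.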
Experiment 2: Integer arithmetic comparison (XOR + mask)
Instead of separate strlen + memcmp, do a single 8-byte load and compare using integer arithmetic:
((needle ^ haystack) & ((1ULL << (len * 8)) - 1)) == 0    (for len < 8)
This combines length detection and comparison into one load + arithmetic sequence. Trade-off: more ALU work vs. fewer memory operations.
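A minimal sketch of the XOR + mask comparison follows, under several assumptions: a little-endian target (so the first bytes in memory are the low-order bits of the loaded word), at least 8 readable bytes on both sides, and hypothetical helper names load8 and firstBytesEqual. Passing len as the needle length plus one includes the NUL terminator in the comparison, which is one way to fold the length check into the same operation; this only works for names of up to 7 characters.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: unaligned-safe 8-byte load.
// Precondition (assumed): at least 8 bytes are readable at p.
inline std::uint64_t load8(const char* p) {
    std::uint64_t w;
    std::memcpy(&w, p, sizeof(w));
    return w;
}

// Hypothetical helper: compares the first `len` bytes (in memory order on a
// little-endian machine) of two 8-byte words. len must be in [0, 8].
inline bool firstBytesEqual(std::uint64_t needle, std::uint64_t haystack, std::size_t len) {
    // Shifting a 64-bit value by 64 is undefined, so len == 8 is handled
    // separately with an all-ones mask.
    const std::uint64_t mask =
        len >= 8 ? ~std::uint64_t{0} : ((std::uint64_t{1} << (len * 8)) - 1);
    // XOR is zero in every bit where the inputs agree; the mask discards
    // the bytes beyond len.
    return ((needle ^ haystack) & mask) == 0;
}
```

For example, with a 3-character needle, calling firstBytesEqual with len = 4 rejects a haystack name that merely starts with the same 3 characters, because the fourth byte must also be the terminator.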
Important: Consider merging with BSONObj::getField() in bsonobj.cpp
For additional context please see the following PR discussion.
- is related to
  - SERVER-120255 Add more scenarios to sbe_get_field_bm, optimize bson::getField() (Closed)
  - SERVER-121006 Implement a hybrid sbe::getField() approach to balance both short & long field names (Closed)