[SERVER-42034] Remove numStr from BSONObjBuilder Created: 01/Jul/19 Updated: 29/Oct/23 Resolved: 29/Jul/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | ADAM Martin (Inactive) | Assignee: | Billy Donahue |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Sprint: | Dev Tools 2019-07-15, Dev Tools 2019-07-29 | ||||||||
| Participants: | |||||||||
| Comments |
| Comment by Githook User [ 29/Jul/19 ] |
|
Author: {'name': 'Billy Donahue', 'username': 'BillyDonahue', 'email': 'billy.donahue@mongodb.com'}Message: (~13% faster, remove the 1kLoC static table) |
| Comment by Githook User [ 22/Jul/19 ] |
|
Author: {'name': 'Ian Boros', 'username': 'puppyofkosh', 'email': 'puppyofkosh@gmail.com'}Message: Revert " This reverts commit be18bd1133d6f8118ebb79a2251c4a49b1c84ec1. |
| Comment by Githook User [ 22/Jul/19 ] |
|
Author: {'name': 'Billy Donahue', 'username': 'BillyDonahue', 'email': 'billy.donahue@mongodb.com'}Message: |
| Comment by Githook User [ 09/Jul/19 ] |
|
Author: {'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}Message: Callers are better with either DecimalCounter or std::to_string. |
| Comment by Billy Donahue [ 03/Jul/19 ] |
|
I LGTMed the CR to implement this change because I see several stylistic, performance, and even correctness reasons to get rid of these static strings. I doubt they were providing a realistic performance benefit, but that's not the most important thing. The most pressing issue to me is the undefined init-time behavior of the numStr() function, relying on assumptions about the C++ implementation that don't hold up to close scrutiny. Specifically, the early read of numStrsReady in BSONObjBuilder::numStr() is undefined behavior invoked on purpose in an effort to protect us from other undefined behavior. The comment above numStrsReady's definition is untrue. Reading it early will yield undefined behavior, not a false value. Would an implementation be allowed to "prove" to itself that numStrsReady is always true, and just static-initialize it as an optimization? As a template, all of std::basic_string is visible to the compiler, so I think it's possible that numStrsReady offers no protection at all from early accesses to the numStrs array, and it's just a waste of CPU time to look at it. But it's a non-const static duration bool that's initialized to true, but it isn't guaranteed to remain true, so numStr() would still have to waste time looking at it. Even if this static-init optimization isn't happening, we would still have a race. The initialization of the numStrs array is not ordered wrt potentially early callers of the inline function BSONObjBuilder::numStr(). We are not guaranteed any order at all between definitions in different translation units and we are not guaranteed by C++ that pre-main() is single-threaded. BSONObjBuilder::numStr can be invoked by a static initializer anywhere. It is even allowed to run concurrently with the initialization of numStrsReady or numStrs, and there's no synchronization provided by the numStr function, so it could be read while being updated. There are other fixes available for these problems, but I think the simplest is to just remove the static strings altogether. Performance-wise, I think the 0-99 numStrs table will make short arrays quicker to write, but long arrays slower to write. But the benefits for short arrays would be swamped by the other processing that happens while preparing the enclosing object. That is, I think we'd be optimizing a drop in the bucket for short arrays. But for long arrays, we're making the whole bucket slower. We should just rip BSONObjBuilder::numStr out. The uses internal to BSONObjBuilder are better served by DecimalCounter. https://docs.google.com/spreadsheets/d/1eOsD8EPCmVTLLfl3_HRG1yYrw2I79oJiDJRb_BdCyks/edit?usp=sharing
The 20 or so uses outside BSONObjBuilder have nothing to do with BSONObj. They usually are incrementing a field name and should use DecimalCounter as well.
|
| Comment by Andrew Morrow (Inactive) [ 02/Jul/19 ] |
|
I don't think we should do it. I suggest we close this ticket. |
| Comment by Eric Milkie [ 02/Jul/19 ] |
|
What is the reasoning behind doing this work now – what is blocked behind it? |