[SERVER-69381] Improve unique_function performance w/small object buffer Created: 01/Sep/22  Updated: 05/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Billy Donahue Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Service Arch
Participants:

 Description   

Our unique_function class template always allocates: its `Impl` is held by a unique_ptr. We could probably benefit from a union with inline storage, used whenever the callable is small enough to fit within an aligned buffer a few pointers in size.

https://llvm.org/doxygen/FunctionExtras_8h_source.html#l00081
If the callable is no larger than 3 pointers, then there's no allocation. It's just moved into the unique_function object.
I think this optimization could reduce async and Future dispatch costs, as long as the relevant lambdas bind only a modest amount of state.
Abseil calls this absl::AnyInvocable, but it uses a maximum inline storage of 2 pointers instead of 3: https://github.com/abseil/abseil-cpp/blob/master/absl/functional/internal/any_invocable.h#L93. Their naming is standards-influenced.
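
For illustration, here is a minimal sketch of the small-buffer idea described above. It is not our actual unique_function, nor the LLVM or Abseil code: the name `SmallUniqueFunction`, the 3-pointer capacity, and the internal vtable layout are all assumptions chosen to keep the example short, and const/ref-qualified signatures are left out. It requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

template <typename Sig>
class SmallUniqueFunction;  // hypothetical name, for illustration only

template <typename R, typename... Args>
class SmallUniqueFunction<R(Args...)> {
    // Assumed capacity: three pointers, like the LLVM implementation linked above.
    static constexpr std::size_t kInlineSize = 3 * sizeof(void*);

    union Storage {
        alignas(void*) unsigned char inlineBuf[kInlineSize];  // small callables live here
        void* heapPtr;                                        // larger ones are heap-allocated
    };
    struct VTable {  // type-erased operations for one concrete callable type
        R (*invoke)(Storage&, Args&&...);
        void (*relocate)(Storage& from, Storage& to) noexcept;
        void (*destroy)(Storage&) noexcept;
    };

    // Inline only if it fits, is suitably aligned, and cannot throw while moving.
    template <typename F>
    static constexpr bool kFitsInline = sizeof(F) <= kInlineSize &&
        alignof(F) <= alignof(void*) && std::is_nothrow_move_constructible_v<F>;

    template <typename F>
    static F& inlineTarget(Storage& s) noexcept {
        return *std::launder(reinterpret_cast<F*>(s.inlineBuf));
    }
    template <typename F>
    static R call(F& f, Args&&... a) {
        if constexpr (std::is_void_v<R>) f(std::forward<Args>(a)...);
        else return f(std::forward<Args>(a)...);
    }
    template <typename F>
    static const VTable* vtableFor() {
        if constexpr (kFitsInline<F>) {
            static const VTable vt{
                [](Storage& s, Args&&... a) -> R {
                    return call(inlineTarget<F>(s), std::forward<Args>(a)...);
                },
                [](Storage& from, Storage& to) noexcept {  // move-construct, then destroy source
                    ::new (static_cast<void*>(to.inlineBuf)) F(std::move(inlineTarget<F>(from)));
                    inlineTarget<F>(from).~F();
                },
                [](Storage& s) noexcept { inlineTarget<F>(s).~F(); }};
            return &vt;
        } else {
            static const VTable vt{
                [](Storage& s, Args&&... a) -> R {
                    return call(*static_cast<F*>(s.heapPtr), std::forward<Args>(a)...);
                },
                [](Storage& from, Storage& to) noexcept { to.heapPtr = from.heapPtr; },
                [](Storage& s) noexcept { delete static_cast<F*>(s.heapPtr); }};
            return &vt;
        }
    }

public:
    SmallUniqueFunction() noexcept = default;

    template <typename F, typename D = std::decay_t<F>,
              typename = std::enable_if_t<!std::is_same_v<D, SmallUniqueFunction> &&
                                          std::is_invocable_r_v<R, D&, Args...>>>
    SmallUniqueFunction(F&& f) {
        if constexpr (kFitsInline<D>) {
            ::new (static_cast<void*>(storage_.inlineBuf)) D(std::forward<F>(f));  // no allocation
        } else {
            storage_.heapPtr = new D(std::forward<F>(f));
        }
        vtable_ = vtableFor<D>();
    }
    SmallUniqueFunction(SmallUniqueFunction&& other) noexcept { moveFrom(other); }
    SmallUniqueFunction& operator=(SmallUniqueFunction&& other) noexcept {
        if (this != &other) { reset(); moveFrom(other); }
        return *this;
    }
    ~SmallUniqueFunction() { reset(); }

    explicit operator bool() const noexcept { return vtable_ != nullptr; }
    R operator()(Args... args) {
        assert(vtable_ && "invoking an empty SmallUniqueFunction");
        return vtable_->invoke(storage_, std::forward<Args>(args)...);
    }
    // Reports whether a callable of type F would avoid the heap allocation.
    template <typename F>
    static constexpr bool storedInline() noexcept { return kFitsInline<std::decay_t<F>>; }

private:
    void moveFrom(SmallUniqueFunction& other) noexcept {
        if (other.vtable_) {
            other.vtable_->relocate(other.storage_, storage_);
            vtable_ = std::exchange(other.vtable_, nullptr);
        }
    }
    void reset() noexcept {
        if (vtable_) { vtable_->destroy(storage_); vtable_ = nullptr; }
    }
    Storage storage_{};
    const VTable* vtable_ = nullptr;
};
```

In this sketch a lambda capturing a single int or pointer is constructed directly in the buffer, while a lambda capturing, say, an array of eight longs falls back to `new`. Requiring a nothrow move constructor for the inline path keeps the wrapper's own move operations noexcept. Whether 2 or 3 pointers is the better capacity for our Future and async lambdas would need to be measured.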

Or we could replace our home-grown `unique_function` with a vendored implementation (like one of those above).



 Comments   
Comment by Jason Chan [ 12/Sep/22 ]

We are looking to prioritize this because we think it is low-hanging fruit for improving performance across the codebase, and it would help when we do performance benchmarking for our upcoming refactoring projects.

Generated at Thu Feb 08 06:13:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.