- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Storage Engines
Summary
We use configuration strings for flexibility in many parts of the WiredTiger API. The APIs that use them were not intended to be on performance-critical paths, but some have ended up there. Our current configuration parsing routines can examine the entire configuration string multiple times.
This ticket tracks a new approach: precompiling a configuration string. Some performance benefit is available without changing the calling code, and larger speedups are available by using new APIs to precompile any configuration string that will be used multiple times.
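The idea of precompiling can be illustrated with a minimal, self-contained sketch. It is not WiredTiger's implementation: the `txn_opts` struct, the helper names, and the `key=value,key=value` parsing rules are all hypothetical and chosen only to show the difference between scanning a string on every call and parsing it once up front.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed options; not WiredTiger's real structures. */
typedef struct {
    bool ignore_prepare;
    bool sync;
    long priority;
} txn_opts;

/*
 * Find "key=value" in a comma-separated string and copy the value into buf.
 * Returns true if the key is present. The first match wins in this toy parser.
 */
static bool
config_get(const char *config, const char *key, char *buf, size_t buflen)
{
    size_t keylen = strlen(key);
    const char *p = config;

    assert(buf != NULL && buflen > 0);
    while (p != NULL && *p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = end == NULL ? strlen(p) : (size_t)(end - p);

        if (len > keylen && strncmp(p, key, keylen) == 0 && p[keylen] == '=') {
            size_t vlen = len - keylen - 1;

            if (vlen >= buflen)
                vlen = buflen - 1;
            memcpy(buf, p + keylen + 1, vlen);
            buf[vlen] = '\0';
            return true;
        }
        p = end == NULL ? NULL : end + 1;
    }
    return false;
}

/* Full parse: the string is scanned once per key looked up. */
static void
txn_opts_parse(const char *config, txn_opts *out)
{
    char v[32];

    out->ignore_prepare =
      config_get(config, "ignore_prepare", v, sizeof(v)) && strcmp(v, "true") == 0;
    out->sync = config_get(config, "sync", v, sizeof(v)) && strcmp(v, "true") == 0;
    out->priority = config_get(config, "priority", v, sizeof(v)) ? strtol(v, NULL, 10) : 0;
}

/* "Precompile": parse once and return a handle that later calls can reuse. */
static txn_opts *
txn_opts_compile(const char *config)
{
    txn_opts *compiled = malloc(sizeof(*compiled));

    if (compiled != NULL)
        txn_opts_parse(config, compiled);
    return compiled;
}

/* String entry point: pays the parsing cost on every call. */
static void
begin_txn_str(const char *config, txn_opts *applied)
{
    txn_opts_parse(config, applied);
}

/* Compiled entry point: no string scanning, only a struct copy. */
static void
begin_txn_compiled(const txn_opts *compiled, txn_opts *applied)
{
    *applied = *compiled;
}
```

A caller that issues the same configuration many times would call `txn_opts_compile` once and then use `begin_txn_compiled` in its hot loop. Both entry points produce the same options, so only the per-call cost differs.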
Motivation
This approach will improve the performance of WT for certain workloads. Also, tickets like WT-11105 and HELP-28951 show that this may be a particular problem for the Graviton ARM processor. Our approach in the past has been mitigation:
- WiredTiger functions that get configuration strings often use a _def variant that passes the default value, allowing a shortcut in many cases. This makes the code slightly less readable and duplicates the initial value, which can also be found in the default configuration string.
- There are several times where internal code has been rewritten (WT-8365) to make an existing API faster.
- We have added a new API (WT-8366) that is a faster version of the original API.
A previous attempt at precompiling was made in WT-8571. While that approach demonstrated good performance, it had some flaws. The chief one was that the changes it required in the "main-line" WiredTiger code that used it were somewhat intrusive. It also lacked the flexibility to handle internal functions that parse configuration strings for multiple APIs. The new approach borrows a lot from that initial effort and corrects its design flaws.
Plan
The plan for this ticket is to incorporate the infrastructure changes needed to support precompiling broadly, but only use it for the WT_SESSION::begin_transaction API. The configuration parsing done by this API has been identified in WT-11105 as responsible for 9% of the time spent in the 100 read YCSB benchmark.
Using it for one API allows us to try out the approach with a minimum of disruption to the WiredTiger code base. Assuming it is successful, changes can be made to other APIs as we deem them to be beneficial.
The change for WT_SESSION::begin_transaction on its own will only give a modest speedup for MongoDB. To get the full benefit, changes will be needed at the MongoDB layer to precompile the configuration string passed to begin_transaction and then reuse the compiled form. That will need to be a separate SERVER ticket.
causes
- WT-12406 Coverity analysis defect 138679: Dereference after null check (Closed)
- WT-12407 Fix memory leak in configuration compilation (Closed)
- WT-12408 Coverity analysis defect 138678: Unintentional integer overflow (Closed)
is depended on by
- SERVER-85527 Use compiled configuration strings for WT_SESSION::begin_transaction calls (Closed)
- WT-12298 Avoid config string parsing when setting cursor bounds (Closed)
is related to
- WT-11105 Make a version of WT_SESSION::begin_transaction that takes a struct instead of config string (Closed)
- WT-12389 Investigate returning a "real" string from compile_configuration (Open)
- WT-12150 Adjunct design ideas for configuration string processing (Closed)