[SERVER-63284] Optimize decoding from columnar format Created: 03/Feb/22  Updated: 24/Jan/23  Resolved: 24/Jan/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Minor - P4
Reporter: Ian Boros
Assignee: Backlog - Query Execution
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams: Query Execution
Sprint: QE 2022-09-19, QE 2022-10-03, QE 2022-10-17
Participants:

 Description   

Decoding type and value information from the columnar format into SBE values will almost certainly be a hot path. In the POC written by Mathias, one optimization used the GCC case-ranges extension (see SERVER-63281). We should revisit this code path, take advantage of SERVER-63281 if and when it is done, and make other optimizations as necessary.
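
For readers unfamiliar with the case-ranges extension mentioned above, here is a minimal sketch of what it looks like. The tag layout (0x00-0x3F for small integers, 0x40-0x7F for short strings) and the names `Kind` and `classifyTag` are hypothetical and chosen only for illustration; this ticket does not describe the actual columnar encoding or the POC code. The `case low ... high:` syntax is a GCC extension (also accepted by Clang), so the sketch falls back to plain comparisons on other compilers.

```cpp
#include <cstdint>

// Hypothetical value categories, for illustration only.
enum class Kind { kSmallInt, kShortString, kOther };

// Classifies a one-byte tag. The byte ranges are assumptions, not the
// real columnar format.
inline Kind classifyTag(std::uint8_t tag) {
#if defined(__GNUC__)
    // GCC/Clang case ranges: one label covers a contiguous run of values.
    // The spaces around "..." are required between integer literals.
    switch (tag) {
        case 0x00 ... 0x3F:
            return Kind::kSmallInt;
        case 0x40 ... 0x7F:
            return Kind::kShortString;
        default:
            return Kind::kOther;
    }
#else
    // Portable fallback with the same behavior.
    if (tag <= 0x3F) return Kind::kSmallInt;
    if (tag <= 0x7F) return Kind::kShortString;
    return Kind::kOther;
#endif
}
```

With this layout, `classifyTag(0x10)` yields `Kind::kSmallInt` and `classifyTag(0x80)` yields `Kind::kOther`. Whether the compiler turns the range form into faster code than the comparison form depends on the compiler and target, which is the kind of question this ticket set out to measure.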



 Comments   
Comment by Justin Seyster [ 18/Oct/22 ]

My early experiments showed some promising results, but I wasn't able to reproduce them reliably. I think that variations in the function layout of the output binary were causing most of the performance differences I observed. In one test, I saw a 6% difference between the two versions I tested, even though I accidentally configured the test in a way that prevented it from using the column store index at all!

To reduce the impact of i-cache effects on tests, I created a single binary that switches between the optimized and unoptimized versions of the columnar decoding function based on a feature flag. Testing on that binary did not show a significant change in performance. Based on that, I'm putting this back on the backlog with the expectation that we probably won't implement it. Time permitting, it may be worthwhile to take a few more measurements once we have more benchmarks in place.
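
The single-binary approach described above can be sketched roughly as follows. This is not the code used in the experiment: `decodeBaseline`, `decodeOptimized`, `selectDecoder`, and `timeDecoderMs` are hypothetical names, the tag layout is invented, and a plain `bool` stands in for the server's feature flag mechanism. The point is only that both versions are linked into one executable, so the layout of the rest of the binary is identical whichever version is measured.

```cpp
#include <array>
#include <chrono>
#include <cstdint>
#include <vector>

// Both decoder versions share one signature so a flag can pick either.
using DecodeFn = std::uint64_t (*)(const std::vector<std::uint8_t>&);

// Hypothetical baseline: branchy decode of an invented tag layout.
std::uint64_t decodeBaseline(const std::vector<std::uint8_t>& cells) {
    std::uint64_t sum = 0;
    for (std::uint8_t c : cells) {
        if (c <= 0x3F) {
            sum += c;
        } else if (c <= 0x7F) {
            sum += static_cast<std::uint64_t>(c - 0x40);
        }
    }
    return sum;
}

// Hypothetical optimized version: same results via a 256-entry lookup table.
std::uint64_t decodeOptimized(const std::vector<std::uint8_t>& cells) {
    static const std::array<std::uint8_t, 256> kTable = [] {
        std::array<std::uint8_t, 256> t{};
        for (int i = 0; i < 256; ++i) {
            int v = i <= 0x3F ? i : (i <= 0x7F ? i - 0x40 : 0);
            t[static_cast<std::size_t>(i)] = static_cast<std::uint8_t>(v);
        }
        return t;
    }();
    std::uint64_t sum = 0;
    for (std::uint8_t c : cells) {
        sum += kTable[c];
    }
    return sum;
}

// Stand-in for a feature flag check: chooses the version at runtime.
DecodeFn selectDecoder(bool useOptimized) {
    return useOptimized ? &decodeOptimized : &decodeBaseline;
}

// Times `reps` calls of the selected decoder and returns elapsed milliseconds.
double timeDecoderMs(DecodeFn fn, const std::vector<std::uint8_t>& cells, int reps) {
    volatile std::uint64_t sink = 0;  // keeps the calls from being optimized away
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        sink = fn(cells);
    }
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A comparison would call `timeDecoderMs(selectDecoder(false), cells, reps)` and `timeDecoderMs(selectDecoder(true), cells, reps)` in the same process, ideally alternating the order across runs. Any remaining difference is then less likely to come from how the linker happened to place unrelated functions.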
