[SERVER-44806] Move DocumentSource::getNext() implementation to the header Created: 22/Nov/19 Updated: 29/Oct/23 Resolved: 09/Dec/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework, Performance, Querying |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.3 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Charlie Swanson | Assignee: | Eric Cox (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng, qopt-team | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Sprint: | Query 2019-12-16 | ||||
| Participants: | |||||
| Linked BF Score: | 0 | ||||
| Description |
|
While investigating a small performance regression we found as fallout from |
| Comments |
| Comment by Githook User [ 09/Dec/19 ] | ||||||||
|
Author: {'name': 'Eric Cox', 'username': 'ericox', 'email': 'eric.cox@mongodb.com'}Message: | ||||||||
| Comment by Eric Cox (Inactive) [ 05/Dec/19 ] | ||||||||
|
acm Yes, the getNext function isn't always inlined. I pulled the binaries from evergreen for cases 1, 2, and 3 above. For case 1 (baseline): The obvious. getNext() isn't inlined so disassembling shows calls to getNext(). Here are baseline perf numbers relative to https://evergreen.mongodb.com/version/performance_58abcf6577982367232a6c76d1ee277a4031ed3c
For case 2: There are 45/66 calls where getNext() was not inlined when simply moving getNext() to the header. But where the compiler can inline at various call-sites it is able to do so. This can explain the perf number I posted above in the above comment.
For case 3:
Full inlining with `attribute(alwas_inline)` gives a few percent over moving getNext() to the header, it could be noise. charlie.swanson let me know if you need further investigation. | ||||||||
| Comment by Eric Cox (Inactive) [ 04/Dec/19 ] | ||||||||
|
acm As discussed on slack, Charlie and James are onboard with investigating further. I'll report back with performance results for diffs that have: 1) A baseline (no changes) | ||||||||
| Comment by Andrew Morrow (Inactive) [ 04/Dec/19 ] | ||||||||
|
eric.cox - Can't say that really convinces me. The function doesn't appear to actually get inlined, and I'm not surprised given its contents. Did you pull the mongod binaries for the two builds and compare the generated code and validate that getNext was actually inlined into important callsites when you moved the definition to the header? If not, I think it is unwise to just proceed without a good understanding of why moving the function to the header had this effect, despite the function not actually being inlined. You could find that your performance gains are ephemeral - a consequence of codegen peculiarities that are in play now, and that might cease to be in play in the future. At minimum, I'd suggest doing an additional experiment where you apply attribute(always_inline) to the definition in the header, verify that the function does really get inlined with that attribute, and that you observe the same performance deltas. | ||||||||
| Comment by Eric Cox (Inactive) [ 04/Dec/19 ] | ||||||||
|
acm We took a closer look at the performance around inlining getNext() by moving the code to the header. Here are some attached screenshots of the performance results.
There seems to be a clear advantage to this approach to support the change so we are going to move forward with it. | ||||||||
| Comment by Eric Cox (Inactive) [ 03/Dec/19 ] | ||||||||
|
Hey charlie.swanson, I started looking at this and was wondering if you did any further perf tests to verify Andrew's comment above. If so, can we chat briefly about the perf investigation? | ||||||||
| Comment by Andrew Morrow (Inactive) [ 25/Nov/19 ] | ||||||||
|
charlie.swanson - I was curious about this. After taking a look at DocumentSource::getNext, I was skeptical that it would actually be inlined. So I built two mongod binaries, one where it was, and one where it wasn't. In neither mongod binary did I find that DocumentSource::getNext was actually inlined. What I did observe was that in the case where DocumentSource::getNext was moved to the header and tagged inline, the emitted definition was not hot/cold partitioned, but in the case where it was defined out of line, it was.
I think it is worth taking a closer look at this before simply moving it to the header. The performance consequences do not appear to be attributable to the function being inlined. |