[SERVER-83980] Investigate using callback gRPC API Created: 07/Dec/23 Updated: 05/Jan/24 Resolved: 11/Dec/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Erin McNulty | Assignee: | Erin McNulty |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Service Arch 2023-12-11 |
| Participants: |
| Description |
|
When doing initial performance testing in After looking around, we saw on the gRPC performance page that the sync server is not recommended for performance-sensitive applications. gRPC instead provides a callback API, which is supposed to be easier to use than the async API but faster than the sync API. Investigate the changes required to implement the callback API instead of the sync API, and run initial performance benchmarks comparing the two. |
| Comments |
| Comment by Erin McNulty [ 11/Dec/23 ] |
|
Update: it was easy to change the threading model using the method I described in my previous comment. TL;DR: we did not see significant improvements for gRPC with the callback API, but I would caution against using this data to justify ignoring the callback API in future investigations, because this was a very rough POC without any optimizations implemented. The results are summarized here:
As seen above, the callback API was not faster than the sync API in initial performance runs on 1 or 4 threads. Relative to MongoRPC, both gRPC implementations are around 2x slower on 1 thread and around 3-4x slower on 4 threads, with the sync API actually winning out by a bit. The more detailed results are here. The POC branch is here; focus on the changes in the grpc/* folder. I am putting this down in order to focus on completing the correctness work for this project, but I think the next steps for PM-3366 are to:
|
| Comment by Erin McNulty [ 07/Dec/23 ] |
|
I was able to change gRPC to use the callback API, but I think that changing the threading model is not as simple as we thought. I ran into a segfault when connecting to the server, and the backtrace led me to the spot in session_workflow right before we call source message. I also saw that right before this message, we switched threads. When I tried out the inline model with this, just to see what would happen, it locked up the entire server and I had to kill it. TL;DR: it's easy enough to switch to the callback API in terms of our direct gRPC code, but the threading model might not be so simple. I think my next step might be to try giving each gRPC stream its own thread at the handleStream level, and then still use kInline when we enter the session workflow. All of my changes are here (this is from all of my perf testing, so only focus on the changes in the grpc/* folder if you are interested in looking). |