[SERVER-84972] Investigate if pipeline-style update performance got better for IncFewLargeDocLongFields Created: 06/Jun/19 Updated: 12/Jan/24 Resolved: 17/Sep/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | James Wahlin | Assignee: | Ruoxin Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | qexec-team | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Sprint: | Query 2019-06-17, Query 2019-07-01, Query 2019-07-29, Query 2019-08-12, Query 2019-08-26, Query 2019-09-09, Query 2019-10-07, Query 2019-12-16, Query 2020-09-21 | ||||
| Participants: | |||||
| Linked BF Score: | 0 | ||||
| Description |
|
See the comments and the previous description for context. We're wondering whether the new strategy for logging these updates will help performance on this workload.
The focus for this ticket will be profiling of the IncFewLargeDocLongFields perf test. There was a 2x-5x slowdown for this test as compared to the baseline across all variants. |
| Comments |
| Comment by Ian Boros [ 16/Sep/20 ] |
|
ruoxin.xu I'd say you're good to close this. |
| Comment by Ruoxin Xu [ 16/Sep/20 ] |
|
This microbenchmark was running an incorrect query. It has now been fixed and renamed to 'IncrementFewKeysLargeDocLongFields' as part of PERF-2028. Generating $v:2 oplog entries did not appear to cause any obvious regression on the new microbenchmark. ian.boros Do you think this ticket needs any further investigation? |
| Comment by Ian Boros [ 03/Sep/20 ] |
|
This workload has a bug caused by a typo in the update pipeline. Instead of incrementing fields, the update is making the document one level deeper each time. Eventually, the updates start failing with a "depth exceeded" error and taking a slow uasserted()/exception code path. I don't think we should read too much into changes in performance until this problem is fixed. See PERF-2028. |
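The failure mode described above can be sketched in plain Python. This is an illustration only: the ticket does not show the actual typo, so `buggy_update`, the `counter` and `nested` field names, and the small `MAX_DEPTH` value are all hypothetical stand-ins, and the server's real nesting limit is much larger.

```python
# Hypothetical illustration of the bug described above; not the benchmark's code.
MAX_DEPTH = 5  # assumed small limit so the effect shows up quickly


class DepthExceeded(Exception):
    """Stands in for the server's "depth exceeded" error."""


def depth(value):
    """Return the nesting depth of a dict (non-dict values have depth 0)."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    return 0


def intended_update(doc):
    """What the workload meant to do: increment the field in place."""
    return {**doc, "counter": doc["counter"] + 1}


def buggy_update(doc):
    """What a nesting typo effectively does: wrap the old value one level deeper."""
    updated = {**doc, "counter": {"nested": doc["counter"]}}
    if depth(updated) > MAX_DEPTH:
        raise DepthExceeded(f"document depth {depth(updated)} exceeds {MAX_DEPTH}")
    return updated


def run(update, doc, iterations):
    """Apply `update` repeatedly, counting updates that fail on the exception path."""
    failures = 0
    for _ in range(iterations):
        try:
            doc = update(doc)
        except DepthExceeded:
            failures += 1  # the document is left unchanged, so every later update fails too
    return doc, failures
```

With the intended update the document stays flat and no update fails. With the buggy one, once the limit is reached every subsequent update fails, so the benchmark ends up measuring the error path rather than the increment.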
| Comment by Charlie Swanson [ 18/Dec/19 ] |
|
Re-titled this ticket to better reflect current reality after discussing with ian.boros. Throwing this back on the backlog for now. |
| Comment by James Wahlin [ 23/Oct/19 ] |
|
david.storch - it is definitely possible that SERVER-41114 would improve performance, and it is worth a try. This ticket was filed to explore slowness in pipeline-based updates when compared to non-pipeline-based updates. The IncFewLargeDocLongFields microbenchmark was implemented for both and is an example of a scenario in which pipeline-based updates will not be performant, since we piggy-back on the replacement update mechanism. This is not actually a regression, since we have not replaced the $inc update operator with pipeline updates. I will retitle this ticket to reflect this. |
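For readers unfamiliar with the two update styles being compared, the following sketch builds an operator-style `$inc` update and an equivalent pipeline-style update for the same fields. The helper names and the field names are hypothetical; the ticket does not show the benchmark's actual update specifications.

```python
# Hypothetical helpers; the benchmark's real field names and update shapes are not shown in the ticket.

def operator_style_increment(fields):
    """Classic update document: $inc touches only the named fields."""
    return {"$inc": {name: 1 for name in fields}}


def pipeline_style_increment(fields):
    """Pipeline update: a $set stage computes each new value with $add.

    Per the comment above, the server applies the pipeline's result through
    the replacement update mechanism, which is where the extra cost arises.
    """
    return [{"$set": {name: {"$add": [f"${name}", 1]} for name in fields}}]
```

Either value can be passed as the update argument to a driver call such as PyMongo's `Collection.update_one` (pipeline updates require MongoDB 4.2 or later).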
| Comment by David Storch [ 22/Oct/19 ] |
|
james.wahlin, is it possible that SERVER-41114 would improve performance? If that's at least plausible, then we could try running a perf patch build with justin.seyster's draft changes for SERVER-41114. Can you also clarify whether there was a performance regression? Or were you making an observation about how pipeline-based updates are slow compared to the baseline set by regular old-fashioned non-pipeline-based updates? |
| Comment by James Wahlin [ 21/Oct/19 ] |
|
I profiled this and found that a significant amount of time is spent in a BSON -> Document/Value -> mutablebson -> BSON transformation for pipeline updates. I was curious whether the work being done on Document/Value at the time would help make this faster. |
| Comment by David Storch [ 19/Oct/19 ] |
|
james.wahlin, I'm not aware of Martin ever looking into this. Can you elaborate on your understanding of the source of the slowness? |
| Comment by James Wahlin [ 18/Oct/19 ] |
|
david.storch - if martin.neupauer did not find anything Document/Value related that would improve document transformation performance (outside of longer term CQF work) then I think we can close this ticket. I investigated earlier and did not find any quick wins. |
| Comment by David Storch [ 18/Oct/19 ] |
|
james.wahlin do you have time to investigate during this sprint? |
| Comment by James Wahlin [ 02/Jul/19 ] |
|
We plan to address the slowness caused by document transformation between mutablebson, BSONObj and Document/Value as part of the Common Query Framework roadmap. The only step remaining here is to confirm whether the Document/Value project can provide any nearer-term improvements. martin.neupauer - I am assigning this ticket to you to wrap up, as I will be OOO next week. If there are no quick wins to be had via Document/Value, then feel free to close this ticket. |
| Comment by James Wahlin [ 25/Jun/19 ] |
|
Reviewing the perf data confirms that the majority of the time is spent creating and destroying document elements. Pipeline updates pay a substantial cost because they go through a BSON -> Document/Value -> mutablebson -> BSON transformation. I suspect that the way forward here will be to replace mutablebson with Document/Value across the update system. As part of this, we can consider replacing the use of the ObjectReplaceExecutor with a mechanism specific to pipeline updates. |
| Comment by James Wahlin [ 24/Jun/19 ] |
|
Initial profiling efforts show that CPU time is split across transforming Document to BSONObj, performing the pipeline transformation and applying the replacement style update to the provided mutablebson document. Applying the changes from the Document/Value project did not improve performance. Next steps are: |