[SERVER-71213] High variance in PriorityTicketHolder microbenchmark runs Created: 09/Nov/22 Updated: 27/Oct/23 Resolved: 11/Nov/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Haley Connelly | Assignee: | Jordi Olivares Provencio |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | | ||||||||
| Sprint: | Execution Team 2022-11-14 | ||||||||
| Participants: | |||||||||
| Description |
|
Currently, we are seeing a case where PriorityTicketHolder performance in the ticketholder_bm benchmark degrades significantly on outlier runs. In general, the PriorityTicketHolder microbenchmarks perform comparably to the SemaphoreTicketHolder, consistently surpassing the SemaphoreTicketHolder's performance in benchmark runs with 1024 threads. After performing some analysis, we discovered that outlier runs with poor performance were spending a significant amount of time (~29%, compared to ~2% in standard runs) in _pthread_rwlock_rdlock. We suspect this could be due to the use of the shared_mutex and sub-optimal waits for shared readers. |
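Since the profile points at _pthread_rwlock_rdlock, which is what std::shared_mutex::lock_shared() resolves to on Linux, here is a minimal sketch of how a shared-lock fast path for ticket acquisition can surface in that symbol. This is illustrative only, under the assumption of a shared_mutex-guarded fast path; the class and member names (SketchTicketHolder, tryAcquire, release) are made up for the example and are not the actual PriorityTicketHolder code.

```cpp
#include <atomic>
#include <shared_mutex>

// Illustrative sketch only (not the actual PriorityTicketHolder): a ticket
// acquisition fast path guarded by a std::shared_mutex taken in shared mode.
// On Linux, lock_shared() is implemented with pthread_rwlock_rdlock, so any
// exclusive-lock traffic shows up in profiles as time spent in
// _pthread_rwlock_rdlock.
class SketchTicketHolder {
public:
    explicit SketchTicketHolder(int numTickets) : _available(numTickets) {}

    // Fast path: take the mutex in shared mode and try to grab a ticket.
    bool tryAcquire() {
        std::shared_lock lk(_mutex);
        int available = _available.load();
        while (available > 0) {
            if (_available.compare_exchange_weak(available, available - 1)) {
                return true;
            }
        }
        return false;  // caller would fall back to a queueing path that needs
                       // the exclusive (writer) side of the mutex
    }

    void release() {
        // Waking a queued waiter in a real implementation would require the
        // exclusive lock; this sketch only returns the ticket.
        _available.fetch_add(1);
    }

private:
    std::shared_mutex _mutex;
    std::atomic<int> _available;
};
```

Under this pattern the reader lock is cheap while uncontended, but any exclusive-lock traffic (or writer preference inside the underlying rwlock) shows up directly as extra time in _pthread_rwlock_rdlock, which would match the outlier runs.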
| Comments |
| Comment by Haley Connelly [ 16/Nov/22 ] | |||||||||
|
daniel.gomezferro@mongodb.com you make a great point, and thanks for meeting offline to investigate this further.
Conclusion
Recommendation
Analysis
Additionally, we had performance data from the background_index_construction genny workload, which aims to test index build performance when there is write contention. To make the comparison accurate, we compared the SemaphoreTicketHolder's performance to the PriorityTicketHolder's performance when index builds are normal priority by default (only normal-priority operations were run in the test).
Theory | |||||||||
| Comment by Jordi Olivares Provencio [ 11/Nov/22 ] | |||||||||
|
That very well might be the case. Having a bit of contention on the locks might explain it; it would align with the dramatic jump in time spent in the pthread reader lock. | |||||||||
| Comment by Daniel Gomez Ferro [ 11/Nov/22 ] | |||||||||
|
Maybe you don't even need to get enqueued; it could just be a case of taking the lock almost always uncontended vs. having some contention, and it appears there's pretty low contention overall given the low number of enqueues. | |||||||||
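To make the uncontended-vs-contended point concrete, the stand-alone measurement below (an assumption-based illustration, not part of ticketholder_bm) times the same lock_shared() loop with and without an occasional exclusive locker on the mutex.

```cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Stand-alone illustration: the identical shared-lock acquisition is far
// cheaper when no writer ever touches the mutex than when a writer grabs it
// occasionally, even though the work done by the readers is unchanged.
int main() {
    std::shared_mutex mutex;
    constexpr int kIterations = 5'000'000;

    auto timeReaders = [&](bool withWriter) {
        std::atomic<bool> stop{false};
        std::thread writer;
        if (withWriter) {
            writer = std::thread([&] {
                while (!stop.load()) {
                    { std::unique_lock lk(mutex); }  // brief exclusive hold
                    std::this_thread::yield();       // let readers make progress
                }
            });
        }
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < kIterations; ++i) {
            std::shared_lock lk(mutex);  // the pthread_rwlock_rdlock being measured
        }
        auto elapsed = std::chrono::steady_clock::now() - start;
        stop.store(true);
        if (writer.joinable()) {
            writer.join();
        }
        return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
    };

    std::cout << "uncontended: " << timeReaders(false) << " ms\n";
    std::cout << "with writer: " << timeReaders(true) << " ms\n";
}
```

Even a writer that only dips in briefly is enough to move the readers off the cheap path, without any operation ever being enqueued in the ticket holder sense.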
| Comment by Jordi Olivares Provencio [ 10/Nov/22 ] | |||||||||
|
This is interesting, but doesn't seem to be the case unfortunately:
I obtained the Enqueued number from the ticketholder statistics for the normal queue, which is where all operations are going. It raises a very interesting question though: why are we enqueueing so little in the benchmark? | |||||||||
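As background on where a number like Enqueued comes from, here is a hypothetical sketch of the kind of per-queue counters such statistics are built on; the struct and field names are assumptions for illustration, not the actual TicketHolder fields. The relevant property is that the counter only moves on the slow (waiting) path, so a tiny Enqueued value means almost every acquisition skipped the queue entirely.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch of per-queue admission statistics (names are
// illustrative, not the real TicketHolder fields).
struct QueueStats {
    std::atomic<std::int64_t> addedToQueue{0};
    std::atomic<std::int64_t> removedFromQueue{0};
};

// Only reached when an acquisition misses the fast path and actually has to
// wait; acquisitions that find a free ticket never touch these counters.
inline void recordEnqueue(QueueStats& stats) {
    stats.addedToQueue.fetch_add(1, std::memory_order_relaxed);
}

inline void recordDequeue(QueueStats& stats) {
    stats.removedFromQueue.fetch_add(1, std::memory_order_relaxed);
}
```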
| Comment by Daniel Gomez Ferro [ 10/Nov/22 ] | |||||||||
|
Are you monitoring how many times you ended up in the queues? A problem you might be running into is that in fast cases you are (almost) never queueing (a thread takes the ticket, sleeps, releases, and reacquires without contention), whereas in slow cases you end up queueing and dequeuing a lot.
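This fast-case/slow-case split can be reproduced with a toy model. The sketch below (a hypothetical MiniTicketHolder built for illustration, not the MongoDB classes) simulates the benchmark loop of acquire, sleep, release: with more tickets than threads the queue is never touched, while shrinking the ticket pool drives the enqueue count up.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// Toy model (not the MongoDB implementation) of the acquire/sleep/release
// loop: the enqueued counter only moves when an acquisition has to wait.
class MiniTicketHolder {
public:
    explicit MiniTicketHolder(int tickets) : _available(tickets) {}

    void acquire() {
        std::unique_lock lk(_mutex);
        if (_available > 0) {  // fast path: ticket available, no waiting
            --_available;
            return;
        }
        ++enqueued;  // slow path: record that we actually had to queue
        _cv.wait(lk, [&] { return _available > 0; });
        --_available;
    }

    void release() {
        {
            std::lock_guard lk(_mutex);
            ++_available;
        }
        _cv.notify_one();
    }

    std::int64_t enqueued{0};  // incremented under _mutex, read after joins

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    int _available;
};

int main() {
    for (int tickets : {128, 16}) {
        MiniTicketHolder holder(tickets);
        std::vector<std::thread> threads;
        for (int t = 0; t < 64; ++t) {
            threads.emplace_back([&] {
                for (int i = 0; i < 100; ++i) {
                    holder.acquire();
                    std::this_thread::sleep_for(std::chrono::microseconds(50));
                    holder.release();
                }
            });
        }
        for (auto& t : threads) {
            t.join();
        }
        std::cout << "tickets=" << tickets << " enqueued=" << holder.enqueued << "\n";
    }
}
```

Running this should print an enqueued count of zero for the oversized ticket pool and a large count once tickets are scarce, mirroring the dichotomy described above: the cost difference is not in holding the ticket but in the enqueue/dequeue round-trips.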