[SERVER-37590] Investigate performance implications of using WiredTiger read-once cursors
Created: 12/Oct/18 | Updated: 08/Apr/20 | Resolved: 07/Nov/18
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Louis Williams |
| Resolution: | Done | Votes: | 0 |
| Labels: | nyc |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Sprint: | Storage NYC 2018-11-05, Storage NYC 2018-11-19 |
| Participants: |
| Description |
So far, using read_once=true cursors has shown no performance benefit, at least in the context of background index builds. Investigate why that is the case, and decide when they should or should not be used. |
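For reference, read_once is a WiredTiger cursor-open configuration intended to reduce cache pollution from one-time scans by making the pages such a cursor reads preferred eviction candidates. Below is a minimal sketch of opening a table scan with and without the flag, using the WiredTiger Python bindings; the home directory and table name are placeholders, not taken from this ticket.

```python
import os
import wiredtiger

# Open a standalone WiredTiger database with a small cache, purely for illustration.
os.makedirs("WT_HOME", exist_ok=True)
conn = wiredtiger.wiredtiger_open("WT_HOME", "create,cache_size=256MB")
session = conn.open_session()
session.create("table:cold", "key_format=S,value_format=S")

# A plain cursor's pages compete with the rest of the working set;
# a read_once cursor's pages are marked as good eviction candidates.
for cfg in (None, "read_once=true"):
    cursor = session.open_cursor("table:cold", None, cfg)
    for key, value in cursor:   # full scan, e.g. the collection scan of an index build
        pass
    cursor.close()

conn.close()
```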
| Comments |
| Comment by Tess Avitabile (Inactive) [ 07/Nov/18 ] |
Thanks, louis.williams! And thanks for the detailed explanation of your perf investigation. |
| Comment by Louis Williams [ 07/Nov/18 ] |
Closing because we are going to proceed with tess.avitabile/matthew.russotto |
| Comment by Louis Williams [ 06/Nov/18 ] |
I've done a significant amount of testing to try to demonstrate a change in index build speed and read throughput, but found that challenging for several reasons. Instead, what I will provide is an examination of the changes in cache behavior, which indicates that, even though there are neither significant demonstrable improvements nor regressions in those metrics, I believe this change will still generally be desirable. First, why was it hard to demonstrate a performance improvement or regression? These are some outstanding issues with my workload
Procedure
I modified my test procedure as follows:
Results
Each block of 3 columns shows data for read_once=true, read_once=false, and the difference between the two measurements, respectively. Each row shows, for each collection, how much data was in cache before the test, after the test, and the difference between those two measurements.
"bytes currently in cache"
This can be interpreted as follows:
"unmodified pages evicted"
This can be interpreted as follows:
Additionally, I mentioned that I ran a single collection scan on the "hot" collection after each test. The results showed, with very low variance over 5 trials, that collection scans were in fact faster because more "hot" data was already in cache.
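The per-collection counters quoted above are WiredTiger statistics exposed through collStats. A minimal sketch of how before/after snapshots of them could be captured with pymongo; the database and collection names are placeholders, and the exact stat field names may differ slightly between server versions:

```python
from pymongo import MongoClient

client = MongoClient()          # assumes a locally running mongod
db = client["readonce_test"]    # hypothetical test database

def cache_stats(coll_name):
    """Fetch the two per-collection WiredTiger cache counters discussed above."""
    wt_cache = db.command("collStats", coll_name)["wiredTiger"]["cache"]
    return {
        "bytes currently in the cache": wt_cache.get("bytes currently in the cache"),
        "unmodified pages evicted": wt_cache.get("unmodified pages evicted"),
    }

before = {name: cache_stats(name) for name in ("hot", "cold")}
# ... run the index build / read workload here ...
after = {name: cache_stats(name) for name in ("hot", "cold")}
```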
Conclusion
Given this data, I think it is safe to conclude that read-once cursors do provide some measurable benefit in reducing cache thrashing, and do not appear to introduce significant disadvantages. The data implies they behave exactly as we would expect, so my belief is that we should plan on using read-once cursors by default wherever we believe they may be beneficial. |
| Comment by Daniel Gottlieb (Inactive) [ 03/Nov/18 ] |
I also ran louis.williams' linked patch and noticed there are no read requests going to disk during the run. I believe the OS filesystem cache is effectively keeping everything in memory. |
| Comment by Bruce Lucas (Inactive) [ 03/Nov/18 ] |
A couple of other things to consider:
- If the hot collection fits entirely in the o/s filesystem cache, then the cost of a WT cache miss may not be that high because it doesn't require i/o. Do you see a large difference in read throughput comparing no index build with an index build?
- Maybe the high rate of activity on the hot collection keeps it in cache, whereas the cold collection gets evicted preferentially because it's only read once. You might see a larger difference if you stop the reads, do the index build, then start the reads again; initially the rate should be lower while the hot collection fills the cache again (see the sketch below). |
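A small sketch of that second suggestion, assuming hypothetical hot/cold collections like those described in Louis's test further down (field names and document count are placeholders): build the index with the readers stopped, then restart the readers and record throughput per one-second interval to watch the ramp while the hot collection refills the cache.

```python
import random
import time

from pymongo import MongoClient

client = MongoClient()
db = client["readonce_test"]     # hypothetical database and collections
NDOCS = 200_000                  # placeholder: however many documents were loaded

# 1. With the readers stopped, build the index on the cold collection
#    ("i" is a placeholder field name).
db.cold.create_index("i")

# 2. Restart the point reads and record ops per one-second bucket; the early
#    buckets should be slower while the hot collection refills the WT cache.
for second in range(30):
    deadline, ops = time.time() + 1.0, 0
    while time.time() < deadline:
        db.hot.find_one({"_id": random.randrange(NDOCS)})
        ops += 1
    print(f"t={second}s: {ops} ops/s")
```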
| Comment by Daniel Pasette (Inactive) [ 02/Nov/18 ] |
Have you examined the ftdc data to confirm that the collection-specific cache stats on the cold table are behaving as you expect? Bruce or Kelsey can show you how to gather those stats and feed them into t2. |
| Comment by Louis Williams [ 02/Nov/18 ] |
I'll summarize here some tests that show insignificant results from using read_once cursors, at least for foreground index builds, which is similar to how future hybrid index builds will behave. I made a test that builds a foreground index on one "cold" collection and performs random point-lookups on another "hot" collection. The WT cache size was fixed at 1GB, and each collection was filled with docs of size 4k, with enough documents to fill 90% of the cache. I measured the total index build time (ms) and the total ops/s across all client threads in that period of time. I ran 5 trials each of read_once=true and read_once=false:
I analyzed the t2 data and can confirm there is less cache pressure in general, at least in cases where the cache exceeds 80% utilization, but the results do not show anything significant. I am open to ideas for a testing configuration that may better demonstrate improvements, but at least in the scenario where we would expect to see improvement, there is none. |
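A rough sketch of a harness along these lines, using pymongo; the database name, collection names, field names, document count, and thread count are placeholders rather than the exact values used here, and the mongod is assumed to be started with --wiredTigerCacheSizeGB 1.

```python
import random
import threading
import time

from pymongo import MongoClient

client = MongoClient()
db = client["readonce_test"]
NDOCS = 200_000                       # placeholder: sized to fill ~90% of a 1GB cache
DOC_PADDING = "x" * 4096              # ~4k documents

# Populate a "hot" collection (random point reads) and a "cold" collection (index build).
for name in ("hot", "cold"):
    db[name].drop()
    db[name].insert_many({"_id": i, "i": i, "padding": DOC_PADDING} for i in range(NDOCS))

stop = threading.Event()
op_counts = []

def point_lookups(counter):
    # Hammer the hot collection with random point reads until the build finishes.
    while not stop.is_set():
        db.hot.find_one({"_id": random.randrange(NDOCS)})
        counter[0] += 1

threads = []
for _ in range(8):                    # placeholder client thread count
    counter = [0]
    op_counts.append(counter)
    threads.append(threading.Thread(target=point_lookups, args=(counter,)))
for t in threads:
    t.start()

start = time.time()
db.cold.create_index("i")             # index build on the cold collection
build_ms = (time.time() - start) * 1000

stop.set()
for t in threads:
    t.join()

total_ops = sum(c[0] for c in op_counts)
print(f"index build: {build_ms:.0f} ms, reads: {total_ops / (build_ms / 1000):.0f} ops/s")
```

Switching between read_once=true and read_once=false is a server-side change (the linked patch mentioned elsewhere in this ticket), so the harness itself stays identical across the two configurations.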
| Comment by Tess Avitabile (Inactive) [ 01/Nov/18 ] |
Thank you for letting me know. I am not too concerned about bitrot for |
| Comment by Eric Milkie [ 01/Nov/18 ] |
I don't expect any further movement on this issue for another two sprints at least. We could expedite it if you feel that the work already staged for |
| Comment by Tess Avitabile (Inactive) [ 01/Nov/18 ] |
milkie, when do you expect the Storage team to look into this issue? I'm interested in when we should plan to schedule |