[SERVER-59240] Review WiredTiger default settings for engine and collections. Created: 11/Aug/21 Updated: 23/Jan/24
| Status: | Blocked |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Luke Pearson | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Storage Execution |
| Sprint: | Execution Team 2022-02-21, Execution Team 2022-03-07, Execution Team 2022-03-21 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
The execution layer opens WiredTiger with an eviction thread min and max of 4, a default decided roughly 6.5 years ago. Additionally, from a technological standpoint, machines today are generally faster and have more resources available. I am specifically interested in the eviction threads min/max configuration: stressful workloads on large machines could utilize more than four eviction threads, which would avoid pulling application threads into eviction as frequently. Other WiredTiger configuration values may also be of interest.
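For reference, a minimal sketch of what that fixed default looks like against WiredTiger's public API (the home directory and cache size here are illustrative placeholders, not the values the server actually passes):

```cpp
#include <wiredtiger.h>
#include <cstdio>

int main() {
    WT_CONNECTION *conn = nullptr;
    // Open WiredTiger with a fixed eviction pool of 4 threads, mirroring the
    // min/max default described above. Path and cache size are placeholders.
    int ret = wiredtiger_open("/tmp/wt-example", nullptr,
                              "create,cache_size=1G,"
                              "eviction=(threads_min=4,threads_max=4)",
                              &conn);
    if (ret != 0) {
        std::fprintf(stderr, "wiredtiger_open: %s\n", wiredtiger_strerror(ret));
        return 1;
    }
    conn->close(conn, nullptr);
    return 0;
}
```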
As for the work here, I think we'd need to tune the values, run perf tests, collate the data, and then decide which value is best. This is potentially a lot of work and could be split into a ticket per configuration. |
| Comments |
| Comment by Daniel Gomez Ferro [ 31/Mar/22 ] |
Sorry for the (very) late answer, daniel.gottlieb. My understanding is that there's only one thread (the BackgroundSync) reading data from the primary and feeding it to the replication workers through the OplogBuffer, so the odds are even worse in that regard (1 vs. 1024 threads). I ran a quick experiment with execution control enabled (PM-1723), which uses a FIFO queue to order WT operations and should produce fairer results and avoid starvation, but the problem was still reproducible. It's possible the starvation happened at other layers, though. In any case, we decided to put this ticket in the backlog to investigate it properly at a later time, since the benefits weren't immediate. |
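For anyone unfamiliar with the fairness idea being tested here, it is roughly a first-in-first-out admission gate in front of WT operations. A minimal illustrative sketch (my own simplification, not the actual PM-1723 implementation):

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Illustrative FIFO ticket holder: at most `permits` operations run at once,
// and waiters are admitted strictly in arrival order, so a flood of later
// arrivals cannot starve an earlier waiter. Sketch only, not PM-1723's code.
class FifoTicketHolder {
public:
    explicit FifoTicketHolder(int permits) : _available(permits) {}

    void acquire() {
        std::unique_lock<std::mutex> lk(_m);
        const uint64_t myTurn = _nextTicket++;
        _cv.wait(lk, [&] { return myTurn == _nextToAdmit && _available > 0; });
        ++_nextToAdmit;
        --_available;
        _cv.notify_all();  // Let the next ticket in line re-check.
    }

    void release() {
        std::lock_guard<std::mutex> lk(_m);
        ++_available;
        _cv.notify_all();
    }

private:
    std::mutex _m;
    std::condition_variable _cv;
    int _available;
    uint64_t _nextTicket = 0;
    uint64_t _nextToAdmit = 0;
};
```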
| Comment by Daniel Gottlieb (Inactive) [ 11/Mar/22 ] |
You've already done much more research on MDB in this area than I have, so apologies if this isn't a useful idea to keep in mind. But in case the thought hasn't been propagated lately: it's typically assumed that primaries perform better than secondaries. I'm not sure how one would best isolate this, but I wonder if increasing the number of eviction threads has the consequence of starving replication worker threads. IIRC we use 8 or 16 replication worker threads (notably, a fixed number much, much smaller than the 1024 clients vying for the primary's attention). |
| Comment by Daniel Gomez Ferro [ 11/Mar/22 ] |
Our performance tests run with 8 cores, so I focused on the build that sets eviction threads = number of cores. I investigated one of the regressions, ParallelInsert-1024.Insert_W1_JTrue.34, which had -32% throughput. In this test we used to have durable lag (and hence replication lag) spiking up to 3 or 4 seconds at the start of some phases (for W1_JTrue and W1). With the increased eviction threads the durable lag increased to 5s and flow control kicked in, creating a large performance regression due to the high concurrency of the test (1024 threads). I couldn't figure out why the durable lag increases consistently. Another large regression happened on YCSB 60GB, with -35% ops_per_sec during load. In this test it looks like there's cache thrashing at the WT level: threads spend more time reading data from disk into the cache, possibly because we are evicting pages more aggressively. |
| Comment by Daniel Gomez Ferro [ 03/Mar/22 ] |
Many workloads improved when setting the number of eviction threads to the number of cores, but there are some significant regressions too, especially at high latency percentiles: https://dag-metrics-webapp.server-tig.staging.corp.mongodb.com/perf-analyzer-viz/?evergreen_version=621f84999ccd4e75c3af2581&evergreen_base_version=sys_perf_ae0c9cf8327d54470175ac8a450df8f08e77578a I'm running another test with the eviction worker threads set to half the available cores, to see whether that helps with those specific workloads. |
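Side note on mechanics: WiredTiger allows the eviction thread counts to be changed on a live connection via WT_CONNECTION::reconfigure, so the half-the-cores experiment can be expressed roughly as follows (a sketch under that assumption; `conn` is an already-open connection, and the hypothetical helper name is mine):

```cpp
#include <wiredtiger.h>
#include <algorithm>
#include <cstdio>
#include <string>
#include <thread>

// Sketch: resize the eviction pool of a live connection to half the cores.
void setEvictionThreadsToHalfCores(WT_CONNECTION *conn) {
    unsigned half = std::max(1u, std::thread::hardware_concurrency() / 2);
    std::string cfg = "eviction=(threads_min=" + std::to_string(half) +
                      ",threads_max=" + std::to_string(half) + ")";
    int ret = conn->reconfigure(conn, cfg.c_str());
    if (ret != 0)
        std::fprintf(stderr, "reconfigure: %s\n", wiredtiger_strerror(ret));
}
```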
| Comment by Louis Williams [ 02/Mar/22 ] |
daniel.gomezferro, let's raise the maximum number of eviction worker threads to the minimum of the number of available CPU cores and 20 (WiredTiger's upper limit). Then we can run our performance workloads and see whether there are any significant regressions. We should also open another ticket to consider changing the other parameters, since that investigation will likely require much more analysis and time. CC josef.ahmad |
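In other words, the proposed default boils down to the following (20 being WiredTiger's documented upper bound for eviction threads_max; the function name is illustrative):

```cpp
#include <algorithm>
#include <thread>

// Proposed default: one eviction thread per available core, capped at
// WiredTiger's maximum of 20.
unsigned proposedEvictionThreadsMax() {
    unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    return std::min(cores, 20u);
}
```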
| Comment by Luke Pearson [ 11/Aug/21 ] |
I can try to dig up some help tickets with a stressed cache if that adds value to this ticket. I do understand that this would be a fairly large chunk of work, so if there isn't a need for it then this ticket can be closed or de-prioritized. |