[SERVER-47368] Request to investigate compile times taking ~40 minutes on selected_tests patch builds Created: 06/Apr/20 Updated: 27/Oct/23 Resolved: 17/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Lydia Stepanek (Inactive) | Assignee: | [DO NOT ASSIGN] Backlog - Server Development Platform Team (SDP) (Inactive) |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Server Development Platform
|
||||
| Participants: |
| Description |
|
On the DAG team, we recently wrapped up the Targeted Test Selection project. While investigating whether selected_tests patch builds were completing in under 1 hour (one of the project's goals), I noticed that compile was consistently taking ~40 minutes on the patch builds I looked at (examples below). My understanding is that compile on patch builds should run in about ~20 minutes, so I thought it was worth filing a ticket. A few patch builds did compile in ~20 minutes, so I am wondering what causes the compile-time difference.
All selected_tests patch builds in the past week:
|
| Comments |
| Comment by Andrew Morrow (Inactive) [ 15/Apr/20 ] | ||||
|
Thanks brian.mccarthy for noticing - that does indeed explain why the file was missing. I don't think we need to revisit the decision to tie the cache to the image. It has been working well and most builds are fast. Every caching scheme will have occasional unlucky outliers. | ||||
| Comment by Andrew Morrow (Inactive) [ 10/Apr/20 ] | ||||
|
I see no reason why not. | ||||
| Comment by Cristopher Stauffer [ 10/Apr/20 ] | ||||
|
Having spoken with Sam: can we just increase the cache size? | ||||
| Comment by Andrew Morrow (Inactive) [ 06/Apr/20 ] | ||||
|
I took a look at https://evergreen.mongodb.com/version/5e8761e057e85a1febcafdf4 and https://evergreen.mongodb.com/version/5e8676bb3627e001aa726295 to compare. The SCons cache log shows that the slow build ended up with a cache hit rate of 47%, while the fast build had a cache hit rate of close to 100%. I think the variation here is entirely due to whether or not the build was well cached. Note that the slow one ran a day later, but both had the same base commit. Perhaps the RHEL 6.2 image's shared cache is getting thrashed by the number of builds and isn't retaining enough days' worth of data? brian.mccarthy, is there any way of knowing? As an example, in the fast build, we found session_catalog.o in the cache:
But in the later and slower build it was gone:
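As an aside, the hit-rate comparison above can be reproduced mechanically from a cache log. The sketch below is illustrative only: it assumes a simplified, hypothetical log format in which each cache lookup appears as either a "Retrieved ..." line (hit) or a "CacheRetrieve failed ..." line (miss); the actual format emitted by SCons cache debugging varies by version, so the prefixes here would need adjusting against a real log.

```python
def cache_hit_rate(log_lines):
    """Compute a cache hit rate from cache-log lines.

    Assumes a hypothetical format: hits start with 'Retrieved',
    misses start with 'CacheRetrieve failed'. Adjust the prefixes
    to match the real log before using this on Evergreen output.
    """
    hits = sum(1 for line in log_lines if line.startswith("Retrieved"))
    misses = sum(1 for line in log_lines
                 if line.startswith("CacheRetrieve failed"))
    total = hits + misses
    return hits / total if total else 0.0


# Tiny made-up log excerpt for demonstration.
log = [
    "Retrieved `build/session_catalog.o' from cache",
    "CacheRetrieve failed for `build/service_entry_point.o'",
    "Retrieved `build/commands.o' from cache",
]
print(f"hit rate: {cache_hit_rate(log):.0%}")  # prints: hit rate: 67%
```

Comparing this number between a fast and a slow build of the same base commit would confirm whether cache eviction, rather than source changes, explains the compile-time gap.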
|