[SERVER-62313] Compiling pre-processed sourcecode with CCache loses path information Created: 29/Dec/21 Updated: 29/Oct/23 Resolved: 29/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.0.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Jackson | Assignee: | Andrew Morrow (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Sprint: | Dev Platform 2022-01-24, Dev Platform 2022-03-21, Dev Platform 2022-04-04 | ||||||||
| Participants: | |||||||||
| Description |
|
Compiling with scons+ccache+UBSAN results in binaries which do not appear to respect the contents of etc/ubsan.denylist.
Checking the ccache diagnostics we can see entries like:
Note that this compiler invocation created build/optdebug/third_party/zstandard-1.4.4/zstd/lib/compress/zstd_compress.dyn.o and consumed /run/user/1000/ccache-tmp/tmp.cpp_stdout.G03F6m.i. Because clang is not running against the true source tree, I believe clang cannot apply the following relative path based suppression from etc/ubsan.denylist:
I believe CCache chooses to run this way because we've disabled CCACHE_CPP2. According to CCache's docs:
When that logic is excised, the following invokations results in functioning test binaries:
|
| Comments |
| Comment by Githook User [ 28/Mar/22 ] |
|
Author: {'name': 'Andrew Morrow', 'email': 'acm@mongodb.com', 'username': 'acmorrow'}Message: |
| Comment by Daniel Moody [ 24/Mar/22 ] |
This seems a little over engineered, how many cases are we expecting to support where we need to disable NOCPP2? I like the generator idea and -o idea though.
Ninja would not be able to have that level of control. Its a bit of an edge case I am referring to, so overall I agree with what you are saying. I doubt a developer would set the CCACHE_CPP2 environment variable explicitly, but switching to the -o option would open a hole for confusion. |
| Comment by Andrew Morrow (Inactive) [ 24/Mar/22 ] |
|
Another advantage of the above generator driven scheme is that it would make the tool a little more robust, since CCACHE_EXTRAFILES wouldn't need to be set before the tool was loaded, and it could also be customized in a cloned Environment and the tool would still do the right thing. |
| Comment by Andrew Morrow (Inactive) [ 24/Mar/22 ] |
|
daniel.moody - I was thinking about the fact that we already had CCACHE_EXTRAFILES in order to let ccache know about files to hash. I just realized though that cache also allows us to explicitly pass option values on the command line with the -o syntax. So maybe we can simplify and harden things, by moving the handling of run_second_cpp (which is what CCACHE_CPP2 vs CCACHE_NOCPP2) into a generator, which can then also consult CCACHE_EXTRAFILES while generating the action string to decide about run_second_cpp. We could also give CCACHE_EXTRAFILES more structure, so you can tag a file as being "pathy" and therefore needing to suppress NOCPP2. It'd be a little refactor, but maybe for the best? I never liked that we needed to muck around with the `ENV` anyway. One caveat is that ccache honors environment variables even over explicit command line arguments (an interesting choice). But we can probably deal with that by just refusing to load the ccache tool if we see anything in env[ENV] that starts with CCACHE_. Thoughts? |
| Comment by Andrew Morrow (Inactive) [ 23/Mar/22 ] |
|
OK I think we have the shape of a plan for this one. I think I will 1) provide a customization point to suppress NOCPP2 and enable that customization point when sanitizers are in play, and 2) add guardrails disallowing icecream when sanitizers are in play, but 3) provide a --do-as-i-said flag that suppresses the guardrails in 2. |
| Comment by Daniel Moody [ 23/Mar/22 ] |
Yeah just performance, CCACHE's default (CCACHE_CPP2=1, the NO is removed meaning double preprocessing) is different then what we set. The cost is the time it takes to preprocess all cache misses, which is not insignificant, but also not unbearable.
We can't control it easily or currently. We'd need to setup a dedicated cluster, which I think we intend to anyways. Heres the ticket around the behavior, which also includes the fixing icecream commit. I think disallowing them by default is probably good, but I use icecream with sanitizers and I usually end up not having issues, because I assume most of the cluster is on a newer icecream now-a-days. It might be flexible if we had a force option with a disclaimer maybe? I usually just delete third party build folder if I see issues then rebuild without icecream so most of my sanitizer build is built with icecream.
|
| Comment by Andrew Morrow (Inactive) [ 23/Mar/22 ] |
Yes, you are right, I got tangled up in the permutations.
I think that as long as we are confident that opting into CCACHE_NOCPP2=1 was only ever an optimization, and not something that we needed to do for correctness (and I think your wiki table supports that), then we can go ahead and build the mechanism that lets the sanitizer builds ask that the ccache tool to not enable the NOCPP2 optimization. I don't think that would be too hard. Can you confirm that there is no correctness motivation behind CCACHE_NOCPP2=1, only performance?
I don't think we have the ability to constrain on the icecream version for remotes, right? What is the behavior that we require from icecream 1.3? Should we disallow sanitizer builds with icecream until we can enforce 1.3? Or should we just make the ccache related fix and hope for the best when icecream is in play? Most engineers aren't doing local sanitizer builds that often, but it does hurt to take away icecream. |
| Comment by Ryan Egesdahl (Inactive) [ 23/Mar/22 ] |
|
daniel.moody Your last concern should be addressed by BUILD-14609. Until that point, I don't think we can be assured that any of the Icecream build hosts are >1.3. |
| Comment by Daniel Moody [ 23/Mar/22 ] |
|
It possible CCACHE has fixed this behavior in a new version, I have not looked at that possibility. |
| Comment by Daniel Moody [ 23/Mar/22 ] |
|
But the command line in the reproducer shows GCC, ccache, and no icecream This is a sanitizer and is using clang I thought? CCACHE_NOCPP2=1 is in play could we fix this by arranging things so we don't use the CCACHE_NOCPP2=1 option when we are using the sanitizer denylists? I think this is a pretty good solution, I know you referenced the wiki link, but those were issues were around GCC, and this case we are using clang. So I think there is an entirely new issue here, related to denylist and ccache using preprocessed files out of the source tree. Do we know if we also have intrinsic difficulties with honoring the denylist files based on filepath due to icecream, for both the with and without ccache cases? I think we have issues with both, but both have a fix. CCACHE we need to not use the CCACHE_NOCPP2=1 in the sanitizer case. ICECREAM we need to make sure the local and all remotes are using >1.3. |
| Comment by Ryan Egesdahl (Inactive) [ 23/Mar/22 ] |
|
acm daniel.moody Keep in mind, one of the problems we were having when implementing this behavior is that we were dealing with mutual dependencies between specific versions Icecream and ccache. I don't think our assumptions need hold in either case anymore, especially with the v4 toolchain, so it's possible that we just need to look at this anew. |
| Comment by Andrew Morrow (Inactive) [ 23/Mar/22 ] |
|
A few follow-up questions:
|
| Comment by Daniel Moody [ 23/Mar/22 ] |
|
acm for further explanation there are only two suboptimal conditions we are currently supporting due to the bugs mentioned in the wiki link.
|
| Comment by Daniel Moody [ 23/Mar/22 ] |
|
Yeah, but CCACHE_NOCPP2=1 is an optimization for ccache so the compiler doesn't need to re-preprocess the source files in a cache miss scenario. For example ccache first preprocesses the source as an input in generation of a hash id to check the cache with. If there was a miss, well you already did some work by preprocessing so instead of invoking the normal compiler command, ccache will give the preprocessed source to the compiler. I'm not sure how much time is actually saved. |
| Comment by Ryan Egesdahl (Inactive) [ 23/Mar/22 ] |
|
acm I think you’re probably correct here. I know from some recent work in the Icecream code that there’s actually quite a bit of room for fixing any problems we might encounter. When we made these original changes, we had no control over the Icecream version, too. Now that we provide our own release, it might make sense to have a more definitive policy here. |
| Comment by Ryan Egesdahl (Inactive) [ 30/Dec/21 ] |
|
spencer.jackson On second thought, I don't know that simply adding the flag in CCFLAGS is going to help much because the issue is that ccache is doing the compile in a temporary directory, which is throwing off all the paths. We would need a more extensive workaround of what ccache is doing here, if it's possible to do. For now, the SCons cache is probably what you want to use here. |
| Comment by Ryan Egesdahl (Inactive) [ 29/Dec/21 ] |
|
spencer.jackson We deliberately disabled CCACHE_CPP2 because it doesn't play well in all scenarios - especially when Icecream is in play - and there are too many combinations to consider when setting it up. However, we are planning to explore using -ffile-prefix-map as part of other work where the compile path seems to matter. That would be a preferable solution here as well if it works. Could you please give it a try? It should be available for clang in the v4 toolchain. I think something like -ffile-prefix-map=$(pwd)=/ might work. |