[SERVER-53952] Building with ninja + ASan poisons the build/install/ directory Created: 21/Jan/21 Updated: 29/Oct/23 Resolved: 02/Apr/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Build |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0, 4.4.7 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Daniel Moody |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Backport Requested: | v4.4 | ||||||||||||
| Sprint: | Dev Platform 2021-02-08, Dev Platform 2021-02-22, Dev Platform 2021-03-08, Dev Platform 2021-03-22, Dev Platform 2021-04-05 | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
I use the following two SCons invocations to generate ninja files.
I find that I must delete the contents of the build/install/ directory when switching from an ASan build to a non-ASan build.
(Using the wt tool as an example here because it compiles quickly.) |
| Comments |
| Comment by Githook User [ 28/Jun/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Daniel Moody', 'email': 'daniel.moody@mongodb.com', 'username': 'dmoody256'} Message: (cherry picked from commit db03ce4c42524ea65537d03f764100722f2f3e9e) | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 28/Jun/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Daniel Moody', 'email': 'daniel.moody@mongodb.com', 'username': 'dmoody256'} Message: (cherry picked from commit 27b41724ed822f1d938e1f90df4adccc2b7d4609) | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Moody [ 09/Apr/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
People from the mailing list are confirming my suspicions that reusing the ninja_log file with different ninja files is dangerous: https://groups.google.com/g/ninja-build/c/Zpi2A2bOyIo | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 05/Apr/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Daniel Moody', 'email': 'daniel.moody@mongodb.com', 'username': 'dmoody256'}Message: | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 02/Apr/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Daniel Moody', 'email': 'daniel.moody@mongodb.com', 'username': 'dmoody256'}Message: | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Moody [ 26/Mar/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
This is actually not caused by inherent mtime problems; it's caused by each ninja variant keeping its own log of output mtimes. This causes the interaction of the ninja variants to play out like this:
The inherent problem is that ninja checks mtimes from disk for inputs and checks mtimes from its log for outputs, assuming it is the only one building the outputs and can therefore keep track of those output mtimes. Anyway, I am pretty sure ninja does not support two build logs building the same set of files. Two ways I can think of to fix this:
| |||||||||||||||||||||||||||||||||||||||||||||||||
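To make the mechanism described in the comment above concrete, here is a toy model of two variants that each keep a private log of output mtimes while writing to the same install path. It is a sketch of the behaviour as described in this ticket, not ninja's actual implementation; the `ToyNinja` class, the file paths, and the integer timestamps are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNinja:
    """Toy stand-in for one ninja variant with its own private build log."""
    name: str
    log: dict = field(default_factory=dict)  # output path -> mtime this variant recorded

    def build(self, disk, inputs, output, now):
        """Rebuild `output` only if an input on disk is newer than the logged output mtime."""
        newest_input = max(disk[path][0] for path in inputs)
        logged = self.log.get(output)
        if logged is not None and logged >= newest_input:
            return False  # believed up to date; the output on disk is never inspected
        disk[output] = (now, self.name)  # (mtime, variant that produced the file)
        self.log[output] = now
        return True


# Hypothetical shared state: one source file and one shared install path.
disk = {"alloc.cpp": (1, "source")}
output = "build/install/lib/libshim_allocator.so"
plain = ToyNinja("plain")
asan = ToyNinja("asan")

plain.build(disk, ["alloc.cpp"], output, now=10)            # plain variant installs its library
asan.build(disk, ["alloc.cpp"], output, now=20)             # asan variant overwrites the same path
rebuilt = plain.build(disk, ["alloc.cpp"], output, now=30)  # plain consults only its own log

print(rebuilt, disk[output])
```

In this model the last call does not rebuild, because the plain variant's log still says its output is newer than the input, yet the file on disk is the one the ASan variant produced.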
| Comment by Ryan Egesdahl (Inactive) [ 02/Feb/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
I think this is a problem due to a subtlety that Ninja just isn't equipped to handle. What seems to have happened is that libshim_allocator.so has a dependency on an ASAN symbol because it was compiled with ASAN. We don't (and cannot) reflect the dependency on ASAN in the build graph, so when the next Ninja build is run without ASAN, Ninja doesn't know it needs to recompile libshim_allocator.so to remove that dependency. libshim_allocator.so is also at the root of the dependency graph, so nothing you're changing is going to cause it to be rebuilt. However, nearly every binary has a dependency on libshim_allocator.so, so when the dynamic linker tries to load it, it finds a dangling symbol dependency on ASAN that isn't associated with a DT_NEEDED*.

SCons doesn't cause this problem because it knows to rebuild and reinstall libshim_allocator.so when the compiler or its arguments change. Ninja, on the other hand, just compares file modification times and isn't looking at md5sums or compiler argument hashes. It sees the file exists, is newer than build.ninja, and wasn't dependent on anything else it rebuilt, so it assumes nothing needs to be done. I would not be surprised to learn that you are sometimes getting the converse case where an ASAN executable is loading a non-ASAN libshim_allocator.so without a problem.

The real issue is that you have two builds that Ninja doesn't know how to differentiate sharing the same location. When you are compiling with different compilers or compiler options with Ninja, you should be installing into different places with INSTALL_DIR or removing build/install before rebuilding because otherwise Ninja will happily pollute the install prefix if you let it. I am not sure there is a way to prevent install prefix pollution that will be workable with Ninja. If, however, you install your ASAN and non-ASAN builds (as well as builds with different compilers) into different locations - or, alternatively, leverage ccache to manage your build artifacts and just unconditionally remove build/install every time you switch to a new compiler or build type - you should not see this problem occurring.

* This happens because the compiler statically links libasan by default, and __asan_init is a global symbol reference to an initializer. Because libshim_allocator.so doesn't reference any ASAN symbols and doesn't have its own initializer, the only way the dynamic loader can possibly resolve that reference is if the binary has it. Therefore, when libshim_allocator.so is built with ASAN, the executable loading it must also be built with ASAN, or the dynamic loader will fail to resolve the symbol reference. Interestingly, you could probably use a non-ASAN libshim_allocator.so with an ASAN executable without any problems.

I made some changes to library shims in --build-tools=next that don't put any shim libraries without symbols on the linker command line, which prevents them from being a runtime dependency. That would effectively have prevented what you're seeing. But it doesn't fix the larger install prefix pollution issue, so it's not a real solution to your problem. | |||||||||||||||||||||||||||||||||||||||||||||||||
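One way to automate the "remove build/install when switching build type" workaround suggested above is a small pre-build guard. The following is a minimal sketch under stated assumptions: the `.build-variant` stamp file, the `prepare_install_dir` helper, and the variant names are hypothetical and are not part of SCons, Ninja, or the MongoDB build scripts.

```python
import shutil
import tempfile
from pathlib import Path

STAMP_NAME = ".build-variant"  # hypothetical marker file, not something SCons or Ninja writes


def prepare_install_dir(install_dir: Path, variant: str) -> bool:
    """Wipe install_dir if it was last populated by a different (or unknown) variant.

    Returns True when the directory was cleared.
    """
    stamp = install_dir / STAMP_NAME
    previous = stamp.read_text().strip() if stamp.exists() else None
    cleared = False
    if install_dir.exists() and previous != variant:
        shutil.rmtree(install_dir)
        cleared = True
    install_dir.mkdir(parents=True, exist_ok=True)
    stamp.write_text(variant + "\n")
    return cleared


# Demonstration in a throwaway directory standing in for build/install.
with tempfile.TemporaryDirectory() as tmp:
    install = Path(tmp) / "build" / "install"
    first = prepare_install_dir(install, "asan")      # fresh directory, nothing to clear
    (install / "libshim_allocator.so").write_text("asan artifact")
    same = prepare_install_dir(install, "asan")       # same variant, artifacts kept
    kept = (install / "libshim_allocator.so").exists()
    switched = prepare_install_dir(install, "plain")  # variant changed, directory wiped
    gone = not (install / "libshim_allocator.so").exists()
```

The guard deliberately treats a directory without a stamp as belonging to an unknown variant and clears it. Installing each variant into its own directory, as the comment also suggests, avoids the wipe entirely at the cost of more disk space.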
| Comment by Andrew Morrow (Inactive) [ 21/Jan/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
I believe that the issue here is related to the fact that Ninja only uses mtime, with all of the issues that entails. I'll do some further digging to confirm that hypothesis. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 21/Jan/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
This issue does not reproduce with plain SCons. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 21/Jan/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
I've reproduced this on both master and v4.4, after removing icecream and ccache just to be sure they weren't a factor. The problem actually looks rather severe:
I'm going to see if I can reproduce this in plain SCons next so we can determine if this is a bug in the transitive install dependency scanner in SCons, or in how those dependencies end up expressed to Ninja. | |||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Morrow (Inactive) [ 21/Jan/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||
|
Thanks max.hirschhorn, that definitely shouldn't be happening. We will take a look. |