[SERVER-49857] ASAN Ubuntu 18.04 build variant did not symbolize its output
Created: 24/Jul/20  Updated: 29/Oct/23  Resolved: 24/Aug/20

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.7.0, 4.4.2

Type: Bug
Priority: Major - P3
Reporter: Max Hirschhorn
Assignee: Ryan Egesdahl (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-72356 llvm symbolizer broke with v4 Closed
is related to SERVER-50499 Debuginfo archive is not extracted fo... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Dev Platform 2020-08-10, Dev Platform 2020-08-24, Dev Platform 2020-09-07
Participants:
Linked BF Score: 42

 Description   

Despite the /opt/mongodbtoolchain/v3/bin/ directory being added to the PATH environment variable for the started mongos process, ASan did not symbolize its output. Could /opt/mongodbtoolchain/v3/bin/llvm-symbolizer actually not be present?

https://clang.llvm.org/docs/AddressSanitizer.html#symbolizing-the-reports

PATH=/data/mci/89468aa1ebaf95a9731196ddd97623aa/src:/data/multiversion:/data/mci/89468aa1ebaf95a9731196ddd97623aa/src/dist-test/bin:/data/mci/89468aa1ebaf95a9731196ddd97623aa/venv/bin:/opt/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/opt/node/bin:/opt/mongodbtoolchain/v3/bin:false INSTALL_DIR=/data/mci/89468aa1ebaf95a9731196ddd97623aa/src/dist-test/bin /data/mci/89468aa1ebaf95a9731196ddd97623aa/src/dist-test/bin/mongos --setParameter enableTestCommands=1 --setParameter logComponentVerbosity={'transaction': 3} --setParameter testingDiagnosticsEnabled=true --configdb=config-rs/localhost:20000 --port=20005
[ShardedClusterFixture:job0:mongos] 2020-07-22T09:29:42.224+0000 mongos started on port 20005 with pid 51285.
...
[ShardedClusterFixture:job0:mongos] 2020-07-22T09:30:32.905+0000     #18 0x5555642a23da in mongo::ThreadPool::_doOneTask(std::unique_lock<mongo::latch_detail::Latch>*) (/data/mci/89468aa1ebaf95a9731196ddd97623aa/src/dist-test/bin/mongos+0x6ba73da)
[ShardedClusterFixture:job0:mongos] 2020-07-22T09:30:32.906+0000     #19 0x5555642a00f5 in mongo::ThreadPool::_consumeTasks() (/data/mci/89468aa1ebaf95a9731196ddd97623aa/src/dist-test/bin/mongos+0x6ba50f5)
[ShardedClusterFixture:job0:mongos] 2020-07-22T09:30:32.906+0000     #20 0x55556429f630 in mongo::ThreadPool::_workerThreadBody(mongo::ThreadPool*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (/data/mci/89468aa1ebaf95a9731196ddd97623aa/src/dist-test/bin/mongos+0x6ba4630)
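
One way to test the question above is to run the same lookup against the PATH value the process was started with. The following is a minimal sketch, not part of resmoke or the Evergreen configuration; the helper names `find_symbolizer` and `describe_lookup` are hypothetical, and it only assumes that the symbolizer is searched for by the name llvm-symbolizer.

```python
import os
import shutil


def find_symbolizer(path_value, name="llvm-symbolizer"):
    """Return the full path of `name` if it is an executable found via
    the PATH-style string `path_value`, otherwise None."""
    return shutil.which(name, path=path_value)


def describe_lookup(path_value, name="llvm-symbolizer"):
    """Summarize the lookup result in one human-readable line."""
    found = find_symbolizer(path_value, name)
    if found is None:
        entries = [p for p in path_value.split(os.pathsep) if p]
        return f"{name} not found in any of {len(entries)} PATH entries"
    return f"{name} resolves to {found}"


if __name__ == "__main__":
    # Check the PATH of the current process; substitute the logged PATH
    # value to check what the mongos process would have seen.
    print(describe_lookup(os.environ.get("PATH", "")))
```

Passing the PATH string from the log line above, on the host that ran the task, would show whether any entry actually contains an executable llvm-symbolizer.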



 Comments   
Comment by Githook User [ 14/Oct/20 ]

Author:

{'name': 'Ryan Egesdahl', 'email': 'ryan.egesdahl@mongodb.com', 'username': 'deriamis'}

Message: SERVER-50363 {A,UB}SAN build should be statically linked

The previous change in SERVER-49857 added --link-model=dynamic to the
{A,UB}SAN build, which caused dependency cycles and missing symbols. We
moved to dynamic linking in later versions, but v4.4 does not have the
build infrastructure to support it and likely never will. The addition
was accidental, so it is removed.
Branch: v4.4
https://github.com/mongodb/mongo/commit/cf5e17ce73846dced71767e961a8f2d59039fe68

Comment by Githook User [ 13/Oct/20 ]

Author:

{'name': 'Ryan Egesdahl', 'email': 'ryan.egesdahl@mongodb.com', 'username': 'deriamis'}

Message: SERVER-49857 Explicit llvm-symbolizer path handling with {A,T,UB}SAN

The toolchain llvm-symbolizer was never actually in PATH despite the
toolchain being appended to it in evergreen.yml, causing confusion while
attempting to diagnose an apparent symbolization failure. This change
explicitly sets the path to llvm-symbolizer for all sanitizer build
variants and removes the last vestiges of the non-working discovery
method.

(cherry picked from commit 20ed5d51cb1c82597f65967be69b81e6e72c0413)
Branch: v4.4
https://github.com/mongodb/mongo/commit/20f26290c6e6d58a8ae40a1484f1eb9f2fa1dc83

Comment by Ryan Egesdahl (Inactive) [ 24/Aug/20 ]

Symbolizing is actually happening but the lack of debuginfo is causing line numbers and source code not to be shown. We've just made discovering the correct llvm-symbolizer more robust here. A request to extract debuginfo during sanitizer builds has been opened as SERVER-50499.
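
The distinction drawn here can be checked mechanically on the stack frames in the description: a frame that names a function but ends in `(module+0xoffset)` was symbolized without line information, while a frame ending in `file:line` had debuginfo available. The following is a rough sketch under that assumption; `classify_frame` and its category labels are hypothetical and not part of any MongoDB tooling, and the frame layouts are inferred from the log lines in this ticket and from typical sanitizer output.

```python
import re

# Frame that names a function: "#N 0xPC in <function> <location>"
_SYMBOLIZED = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+in\s+(.+)\s+(\S+)$")
# Frame with no function name: "#N 0xPC (<module>+0x<offset>)"
_RAW = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+\((.+)\+0x[0-9a-f]+\)$")
# Location shapes: "(module+0xoffset)" versus "file:line[:column]"
_MODULE_OFFSET = re.compile(r"^\(.+\+0x[0-9a-f]+\)$")
_SOURCE_LINE = re.compile(r"^.+?:\d+(?::\d+)?$")


def classify_frame(line):
    """Classify one sanitizer stack-frame line by how much was resolved."""
    line = line.rstrip()
    match = _SYMBOLIZED.search(line)
    if match:
        location = match.group(3)
        if _MODULE_OFFSET.match(location):
            return "function-only"          # symbolized, no line info
        if _SOURCE_LINE.match(location):
            return "function+source-line"   # symbolized with debuginfo
        return "unknown"
    if _RAW.search(line):
        return "unsymbolized"
    return "not-a-frame"
```

Under this classification, the frames quoted in the description come out as `function-only`, which is consistent with symbolization working while debuginfo is missing.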

Comment by Githook User [ 22/Aug/20 ]

Author:

{'name': 'Ryan Egesdahl', 'email': 'ryan.egesdahl@mongodb.com', 'username': 'deriamis'}

Message: SERVER-49857 Explicit llvm-symbolizer path handling with {A,T,UB}SAN

The toolchain llvm-symbolizer was never actually in PATH despite the
toolchain being appended to it in evergreen.yml, causing confusion while
attempting to diagnose an apparent symbolization failure. This change
explicitly sets the path to llvm-symbolizer for all sanitizer build
variants and removes the last vestiges of the non-working discovery
method.
Branch: master
https://github.com/mongodb/mongo/commit/20ed5d51cb1c82597f65967be69b81e6e72c0413

Comment by Andrew Morrow (Inactive) [ 21/Aug/20 ]

max.hirschhorn - I believe symbolization is happening, but it appears that the debug info files are not present so the symbolizer can't provide line numbers.

robert.guo - I thought we were always pulling down the debug info at the beginning of each task? For instance this task https://evergreen.mongodb.com/task/mongodb_mongo_master_ubuntu1804_debug_asan_disk_wiredtiger_patch_e8b0acc393d181a50301a10ae436dc580b335858_5f3586d03066156e86056d18_20_08_13_18_30_59##comparehashes=e8b0acc393d181a50301a10ae436dc580b335858&threads=all downloads the debug info here: https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_ubuntu1804_debug_asan_disk_wiredtiger_patch_e8b0acc393d181a50301a10ae436dc580b335858_5f3586d03066156e86056d18_20_08_13_18_30_59/0?type=T#L17

If we just unpacked it for sanitizer builds, I think line numbers would happen again.

Comment by Ryan Egesdahl (Inactive) [ 31/Jul/20 ]

It looks like what's happening here is that the variant_path_suffix we set on the build variant is not being added to PATH during execute_resmoke_tests for some reason. I can confirm that the symbolizer is used as intended when ASAN_SYMBOLIZER_PATH is set in the build variant:

https://logkeeper.mongodb.org/build/0d12db79832f9ae4295e4d9e259c6b94/test/5f249c6154f248721e04eba0?raw=1

I question the reason why we started adding *_SYMBOLIZER_PATH to the sanitizer build variants. It should be working if it were actually being added to the path, as I have confirmed locally. Something seems to be wrong with how we're using variant_build_path here, or with how it ends up functioning during resmoke execution. I'm following up on that issue now to determine what the path forward here should be.
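
For local reproduction, setting the symbolizer explicitly instead of relying on PATH might look like the sketch below. This is an illustration only, not how evergreen.yml or resmoke does it; the helper name `sanitizer_env` is hypothetical, and the only facts taken from this ticket are the ASAN_SYMBOLIZER_PATH variable and the toolchain location /opt/mongodbtoolchain/v3/bin/llvm-symbolizer.

```python
import os


def sanitizer_env(symbolizer_path, base_env=None):
    """Return a copy of the environment with ASAN_SYMBOLIZER_PATH set.

    Raises FileNotFoundError if `symbolizer_path` is not an executable
    file, so a wrong path fails loudly instead of silently producing
    unsymbolized reports.
    """
    if not (os.path.isfile(symbolizer_path) and os.access(symbolizer_path, os.X_OK)):
        raise FileNotFoundError(f"no executable symbolizer at {symbolizer_path}")
    # Copy so the caller's environment is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["ASAN_SYMBOLIZER_PATH"] = symbolizer_path
    return env
```

A caller could then start the binary under test with something like `subprocess.run(["./mongos", "--port=20005"], env=sanitizer_env("/opt/mongodbtoolchain/v3/bin/llvm-symbolizer"))`, which takes PATH resolution out of the picture entirely.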

Comment by Ryan Egesdahl (Inactive) [ 30/Jul/20 ]

I think this might be related to not having ASAN_SYMBOLIZER_PATH in etc/evergreen.yml the way we do in SConstruct, and maybe we need to do that because we can't fully trust PATH to be right. I'm not entirely convinced of that, but it is a difference that might matter. I don't really know how to reproduce this and verify that fact yet, though. I'm working on that.
