setup_spawnhost_coredump script unable to read core dumps when EngFlow remote execution was used


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical - P2
    • 8.3.0-rc0
    • Affects Version/s: None
    • Component/s: Testing Infrastructure
    • DevProd Build
    • ALL

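      EngFlow remote execution uses a container environment whose system libraries differ in version from those on the Evergreen host. When a core dump generated inside the container is opened on the host, GDB loads the host's libraries, whose build-ids do not match the core file, and cannot unwind the stack.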

      $ gdb -nx ./bin/db_disagg_storage_test ./dump_1758723036501584558.core
      GNU gdb (GDB) 16.3
      ...
      warning: Build-id of /lib64/libssl.so.3 does not match core file.
      
      warning: Build-id of /lib64/libcrypto.so.3 does not match core file.
      
      warning: Build-id of /lib64/libcurl.so.4 does not match core file.
      
      warning: Build-id of /lib64/libresolv.so.2 does not match core file.
      
      warning: Build-id of /lib64/libm.so.6 does not match core file.
      
      warning: Build-id of /lib64/libgcc_s.so.1 does not match core file.
      
      warning: Build-id of /lib64/libc.so.6 does not match core file.
      
      warning: Build-id of /lib/ld-linux-aarch64.so.1 does not match core file.
      
      warning: Build-id of /lib64/libpsl.so.5 does not match core file.
      
      warning: Build-id of /lib64/libgssapi_krb5.so.2 does not match core file.
      
      warning: Build-id of /lib64/libkrb5.so.3 does not match core file.
      
      warning: Build-id of /lib64/libk5crypto.so.3 does not match core file.
      
      warning: Build-id of /lib64/libkrb5support.so.0 does not match core file.
      
      warning: Build-id of /lib64/libkeyutils.so.1 does not match core file.
      ...
      Core was generated by `src/mongo/db/modules/atlas/src/disagg_storage/db_disagg_storage_test'.
      Program terminated with signal SIGABRT, Aborted.
      #0  0x0000ffff77a9b454 in ?? ()
      [Current thread is 1 (LWP 27)]
      (gdb) source ~/.gdbinit
      (gdb) bt
      #0  0x0000ffff77a9b454 in ?? ()
      #1  0x0000ffff77a9b440 in ?? ()
      Backtrace stopped: previous frame identical to this frame (corrupt stack?)
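
      To see at a glance which libraries are affected, the build-id warnings can be filtered out of GDB's console output. The following is a minimal sketch, not part of setup_spawnhost_coredump; the function name list_mismatched_libs is hypothetical, and it assumes the warning text has exactly the form shown in the transcript above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read GDB console output on stdin and print the path of
# every shared library whose build-id GDB reported as not matching the core.
# Assumes the warning wording seen in the transcript above.
list_mismatched_libs() {
    sed -n 's/^[[:space:]]*warning: Build-id of \(.*\) does not match core file\.$/\1/p'
}
```

      Example use, assuming the warnings arrive on stderr: gdb -nx -batch ./bin/db_disagg_storage_test ./dump_1758723036501584558.core 2>&1 | list_mismatched_libs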
      

      The container's filesystem, including the system libraries under /lib64, can be copied onto the Evergreen host and passed to GDB as the sysroot with commands such as the following.

      $ CONTAINER=$(docker create quay.io/mongodb/bazel-remote-execution@sha256:06fc8103d2e2af9878ee7c5a792f32b3de1c8d68266c193f7ab35b7a136da519)
      $ mkdir ./sysroot
      $ sudo docker cp "$CONTAINER":/ ./sysroot/
      Successfully copied 387MB to /data/debug/sysroot/
      $ sudo chown -R $USER:$USER ./sysroot
      $ gdb -nx -ex 'set sysroot ./sysroot/' ./bin/db_disagg_storage_test ./dump_1758723036501584558.core
      ...
      (gdb) source ~/.gdbinit
      (gdb) bt 14
      #0  0x0000ffff77a9b454 in __pthread_kill_implementation () from ./sysroot/lib64/libc.so.6
      #1  0x0000ffff77a52320 [PAC] in raise () from ./sysroot/lib64/libc.so.6
      #2  0x0000ffff94cdbbf4 [PAC] in mongo::endProcessWithSignal (signalNum=signalNum@entry=6) at src/mongo/util/signal_handlers_synchronous.cpp:392
      #3  0x0000ffff94cde024 in mongo::(anonymous namespace)::abruptQuit (signalNum=6) at src/mongo/util/signal_handlers_synchronous.cpp:261
      #4  <signal handler called>
      #5  0x0000ffff77a9b454 in __pthread_kill_implementation () from ./sysroot/lib64/libc.so.6
      #6  0x0000ffff77a52320 [PAC] in raise () from ./sysroot/lib64/libc.so.6
      #7  0x0000ffff77a39224 [PAC] in abort () from ./sysroot/lib64/libc.so.6
      #8  0x0000ffff94ccc9e8 [PAC] in mongo::(anonymous namespace)::callAbort () at src/mongo/util/assert_util.cpp:94
      #9  0x0000ffff94ccdc74 in mongo::(anonymous namespace)::invariantFailedImpl<mongo::WrappedStdSourceLocation> (expr=0xffff900cf0b8 "_clients.empty()", loc=...)
          at src/mongo/util/assert_util.cpp:131
      #10 mongo::invariantFailed (expr=expr@entry=0xffff900cf0b8 "_clients.empty()", loc=...) at src/mongo/util/assert_util.cpp:148
      #11 0x0000ffff901198cc in mongo::invariantWithLocation<bool> (testOK=<optimized out>, expr=0xffff900cf0b8 "_clients.empty()", loc=...) at src/mongo/util/assert_util_core.h:94
      #12 mongo::ServiceContext::~ServiceContext (this=0x73ab7f94a240, __in_chrg=<optimized out>) at src/mongo/db/service_context.cpp:160
      #13 0x0000ffff901198e4 in mongo::ServiceContext::~ServiceContext (this=0x73ab7f94a240, __in_chrg=<optimized out>) at src/mongo/db/service_context.cpp:161
      (More stack frames follow...)
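
      The manual steps above could be wrapped in a small script. This is only a sketch under stated assumptions, not the actual setup_spawnhost_coredump implementation: the function names run, extract_sysroot, and debug_core, and the DRY_RUN variable, are all hypothetical. The final docker rm is an addition that is not in the original commands; it removes the temporary container once its filesystem has been copied.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual sysroot steps.

# Run a command, or only print it when DRY_RUN=1 (DRY_RUN is an assumption of
# this sketch; the printed form does not preserve shell quoting).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# extract_sysroot IMAGE DEST: copy the root filesystem of IMAGE into DEST.
extract_sysroot() {
    local image="$1" dest="$2" container owner
    owner="${USER:-$(id -un)}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run docker create "$image"
        container="<container-id>"   # placeholder shown in dry-run output
    else
        container=$(docker create "$image") || return 1
    fi
    run mkdir -p "$dest"
    run sudo docker cp "$container":/ "$dest/"
    run sudo chown -R "$owner:$owner" "$dest"
    run docker rm "$container"       # cleanup; not in the original commands
}

# debug_core SYSROOT BINARY CORE: open CORE in GDB, resolving libraries from SYSROOT.
debug_core() {
    local sysroot="$1" binary="$2" core="$3"
    run gdb -nx -ex "set sysroot $sysroot" "$binary" "$core"
}
```

      Running with DRY_RUN=1 prints the commands without executing them, for example: DRY_RUN=1 extract_sysroot IMAGE ./sysroot, where IMAGE is the pinned bazel-remote-execution image reference shown above.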
      

            Assignee: Zack Winter
            Reporter: Max Hirschhorn
            Votes: 0
            Watchers: 4
