Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-13683

all threads must call ProfilerRegisterThread() for gperftools

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: Diagnostics
    • Labels:
      None
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      see https://code.google.com/p/gperftools/issues/detail?id=335 where gperftool maintainers admit this is probably needed and the thread at https://groups.google.com/forum/#!topic/google-perftools/gEpE7_AdibI. This is a regression in gperftools as it wasn't needed in the near past. But many, many years ago it was needed when I was profiling mysqld. I added the call to ProfilerRegisterThread to to handleIncomingMsg.

      In this case I compiled the server with --user-cpu-profiler, called _cpuProfilerStart, ran a workload, called _cpuProfilerStop

      Show
      see https://code.google.com/p/gperftools/issues/detail?id=335 where gperftool maintainers admit this is probably needed and the thread at https://groups.google.com/forum/#!topic/google-perftools/gEpE7_AdibI . This is a regression in gperftools as it wasn't needed in the near past. But many, many years ago it was needed when I was profiling mysqld. I added the call to ProfilerRegisterThread to to handleIncomingMsg. In this case I compiled the server with --user-cpu-profiler, called _cpuProfilerStart, ran a workload, called _cpuProfilerStop

      Description

      To get CPU profiling from gperftools on recent Linux systems/kernels, all threads should call ProfilerRegisterThread. This issue is forked from https://jira.mongodb.org/browse/SERVER-6628

      Without adding the calls to ProfilerRegisterThread the gperftools profiler output file has nothing useful...
      pprof --pdf /path/to/mongod /path/to/profiler/output
      Using local file mongodb-src-r2.6.0/build/linux2/norm/mongo/mongod.
      Using local file prof2.260.
      No nodes to print

        Activity

        Hide
        schwerin Andy Schwerin added a comment - - edited

        Are you using the system version of tcmalloc or the version shipped in the mongodb source? This comes at an opportune time, as we're investigating upgrading our shipped tcmalloc.

        Show
        schwerin Andy Schwerin added a comment - - edited Are you using the system version of tcmalloc or the version shipped in the mongodb source? This comes at an opportune time, as we're investigating upgrading our shipped tcmalloc.
        Hide
        mdcallag Mark Callaghan added a comment -

        I use the version shipped with MongoDB (--allocator=tcmalloc) and confirmed by looking at scons build output.

        Show
        mdcallag Mark Callaghan added a comment - I use the version shipped with MongoDB (--allocator=tcmalloc) and confirmed by looking at scons build output.
        Hide
        acm Andrew Morrow added a comment -

        Hi Mark -

        A few of us have recently tested this on our dev machines, and in each case (after making some local edit to fix or work around SERVER-6628; we are reviewing a fix for that issue separately) we get profiles that contain data from all threads, not just the main thread, without needing to call ProfilerRegisterThread() from each thread. I know that we have tested on at least Ubuntu 12.04 and 14.04.

        I see that you confirm above that you are using --allocator=tcmalloc, however that only enables using tcmalloc: it does not select which version (vendored vs system) of the tcmalloc allocator you are using. That choice is controlled by the --use-system-tcmalloc and the --use-system-all flag. Can you confirm that you are also not using --use-system-tcmalloc or --use-system-all flags, or doing anything with the include or library search path which would cause the build system to select something other than the vendored gperftools?

        As an aside, another approach to profiling would be to use callgrind, though of course that comes with significantly higher overhead than the stack sampling profiler.

        Please let me know how else we can help.

        Thanks,
        Andrew

        Show
        acm Andrew Morrow added a comment - Hi Mark - A few of us have recently tested this on our dev machines, and in each case (after making some local edit to fix or work around SERVER-6628 ; we are reviewing a fix for that issue separately) we get profiles that contain data from all threads, not just the main thread, without needing to call ProfilerRegisterThread() from each thread. I know that we have tested on at least Ubuntu 12.04 and 14.04. I see that you confirm above that you are using --allocator=tcmalloc, however that only enables using tcmalloc: it does not select which version (vendored vs system) of the tcmalloc allocator you are using. That choice is controlled by the --use-system-tcmalloc and the --use-system-all flag. Can you confirm that you are also not using --use-system-tcmalloc or --use-system-all flags, or doing anything with the include or library search path which would cause the build system to select something other than the vendored gperftools? As an aside, another approach to profiling would be to use callgrind, though of course that comes with significantly higher overhead than the stack sampling profiler. Please let me know how else we can help. Thanks, Andrew
        Hide
        mdcallag Mark Callaghan added a comment - - edited

        I use the following and don't show all of the paths (BASE_CXX, BASE_CC, UWPATH) but trust me that it makes builds complicated because we have compiler/library toolchains so that builds done on my devel server will also work in production.

        scons --variant-dir=linux2/norm --nostrip --full --cxx=${BASE_CXX} --cc=${BASE_CC} --prefix=/data/mysql/$1 --allocator=tcmalloc  --opt=on --mongod-concurrency-level=db -j 10 --extrapath=${UWPATH} --extralib=unwind --use-cpu-profiler mongod

        From the build output libtcmalloc gets built from provided code:
        ranlib build/linux2/norm/third_party/gperftools-2.0/libtcmalloc_minimal.a

        From "nm mongod | grep -i tcmalloc" I see tcmalloc but I do not see it linked with libtcmalloc in the build output

        From the mongod command line I see this, and there is more but command line output is huge (all object files, few libraries). So I think I have included the right version of gperftools into mongod
        build/linux2/norm/third_party/gperftools-2.0/src/base/atom
        icops-internals-x86.o build/linux2/norm/third_party/gperftools-2.0/src/base/dynamic_annotations.o build/linux2/norm/third_party/gperftools-2.0/src/base/l
        ogging.o build/linux2/norm/third_party/gperftools-2.0/src/base/spinlock.o build/linux2/norm/third_party/gperftools-2.0/src/base/spinlock_internal.o
        ...

        Show
        mdcallag Mark Callaghan added a comment - - edited I use the following and don't show all of the paths (BASE_CXX, BASE_CC, UWPATH) but trust me that it makes builds complicated because we have compiler/library toolchains so that builds done on my devel server will also work in production. scons --variant-dir=linux2/norm --nostrip --full --cxx=${BASE_CXX} --cc=${BASE_CC} --prefix=/data/mysql/$1 --allocator=tcmalloc --opt=on --mongod-concurrency-level=db -j 10 --extrapath=${UWPATH} --extralib=unwind --use-cpu-profiler mongod From the build output libtcmalloc gets built from provided code: ranlib build/linux2/norm/third_party/gperftools-2.0/libtcmalloc_minimal.a From "nm mongod | grep -i tcmalloc" I see tcmalloc but I do not see it linked with libtcmalloc in the build output From the mongod command line I see this, and there is more but command line output is huge (all object files, few libraries). So I think I have included the right version of gperftools into mongod build/linux2/norm/third_party/gperftools-2.0/src/base/atom icops-internals-x86.o build/linux2/norm/third_party/gperftools-2.0/src/base/dynamic_annotations.o build/linux2/norm/third_party/gperftools-2.0/src/base/l ogging.o build/linux2/norm/third_party/gperftools-2.0/src/base/spinlock.o build/linux2/norm/third_party/gperftools-2.0/src/base/spinlock_internal.o ...
        Hide
        acm Andrew Morrow added a comment - - edited

        Thanks Mark -

        Your scons invocation looks reasonable to me, though I wonder if you actually need --extralib=unwind, since we should be adding it for you when building with --use-cpu-profiler: https://github.com/mongodb/mongo/blob/v2.6/src/third_party/gperftools-2.0/SConscript#L82-L84.

        In an effort to rule out one potential source of variation between your system and ours, could you perhaps try building against the vanilla system compiler and runtime libraries, rather than the ones from your production compatible toolchain?

        Thanks,
        Andrew

        Show
        acm Andrew Morrow added a comment - - edited Thanks Mark - Your scons invocation looks reasonable to me, though I wonder if you actually need --extralib=unwind, since we should be adding it for you when building with --use-cpu-profiler: https://github.com/mongodb/mongo/blob/v2.6/src/third_party/gperftools-2.0/SConscript#L82-L84 . In an effort to rule out one potential source of variation between your system and ours, could you perhaps try building against the vanilla system compiler and runtime libraries, rather than the ones from your production compatible toolchain? Thanks, Andrew
        Hide
        mdcallag Mark Callaghan added a comment -

        Switched back to Ubuntu 13.10 on my VM using the build command below. The gperftools output file created by mongod is ~160kb but pprof --text prints nothing and pprof --gv prints "No nodes to print". This is the same as I experienced before. Given that MongoDB already has a few lines of code compiled when gperftools is enabled, this only requires 1 or 2 extra lines to avoid uncertainty from problems with certain versions of Linux.

        scons --variant-dir=linux2/norm --nostrip --full --prefix=/home/mark/i/m260 --allocator=tcmalloc  --opt=on --mongod-concurrency-level=db  --use-cpu-profiler install

        Show
        mdcallag Mark Callaghan added a comment - Switched back to Ubuntu 13.10 on my VM using the build command below. The gperftools output file created by mongod is ~160kb but pprof --text prints nothing and pprof --gv prints "No nodes to print". This is the same as I experienced before. Given that MongoDB already has a few lines of code compiled when gperftools is enabled, this only requires 1 or 2 extra lines to avoid uncertainty from problems with certain versions of Linux. scons --variant-dir=linux2/norm --nostrip --full --prefix=/home/mark/i/m260 --allocator=tcmalloc --opt=on --mongod-concurrency-level=db --use-cpu-profiler install
        Hide
        acm Andrew Morrow added a comment - - edited

        Hi Mark -

        I just spun up a clean Ubuntu 13.10 image on Digital Ocean, and I was able to use the cpu profiler:

        # lsb_release -c
        Codename:	saucy

        I set up the packages necessary to build. You need to install google-perftools to get a copy of google-pprof since we don't build it. Then cloned mongo, checked out 2.6.0, built it using your options, and started mongod.

        # sudo apt-get install scons git build-essential libboost-all-dev g++-multilib libunwind8-dev google-perftools
        # git clone https://github.com/mongodb/mongo.git
        # git checkout r2.6.0
        # scons --variant-dir=linux2/norm --nostrip --full --prefix=/root/opt --allocator=tcmalloc  --opt=on --mongod-concurrency-level=db  --use-cpu-profiler -j20 install
        # mkdir -d /data/db
        # /root/opt/bin/mongod 

        Then in the mongo shell in another session:

        # /root/opt/bin/mongo
        > use admin
        > db.runCommand({_cpuProfilerStart: {profileFilename: './foo.prof'}});

        Then run some update load against mongod, then, in the mongo shell

        # /root/opt/bin/mongo
        > db.runCommand({_cpuProfilerStop: {}});

        I now have a foo.prof file with contents:

        # ls -la ./foo.prof
        -rw-r--r-- 1 root root 352217 May  2 14:54 ./foo.prof

        Running google-pprof (from the ubuntu google-perftools package) against this file gets me useful output:

        # google-pprof --text /root/opt/bin/mongod ./foo.prof | grep mongo::Update
        Using local file /root/opt/bin/mongod.
        Using local file ./foo.prof.
        Removing _L_unlock_13 from all stack traces.
              18   0.2%  61.9%      600   8.2% mongo::UpdateDriver::update
               7   0.1%  82.8%       97   1.3% mongo::UpdateDriver::~UpdateDriver
               6   0.1%  86.2%      180   2.5% mongo::UpdateDriver::parse
               6   0.1%  86.3%     1881  25.8% mongo::UpdateExecutor::execute
               6   0.1%  86.4%      215   3.0% mongo::UpdateExecutor::prepare
               6   0.1%  86.5%        6   0.1% mongo::UpdateLifecycleImpl::setCollection
               5   0.1%  89.6%      109   1.5% mongo::UpdateDriver::addAndParse
               5   0.1%  89.7%       15   0.2% mongo::UpdateResult::UpdateResult
               4   0.1%  92.4%       73   1.0% mongo::UpdateDriver::UpdateDriver
               4   0.1%  92.5%       39   0.5% mongo::UpdateDriver::makeOplogEntryQuery
               4   0.1%  92.5%       77   1.1% mongo::UpdateExecutor::UpdateExecutor
               3   0.0%  94.8%       40   0.5% mongo::UpdateDriver::clear
               3   0.0%  94.9%       14   0.2% mongo::UpdateExecutor::parseQuery
               3   0.0%  94.9%      185   2.5% mongo::UpdateExecutor::parseUpdate
               3   0.0%  95.0%       17   0.2% mongo::UpdateLifecycleImpl::UpdateLifecycleImpl
               1   0.0%  99.4%        1   0.0% mongo::UpdateDriver::refreshIndexKeys
               1   0.0%  99.4%        1   0.0% mongo::UpdateDriver::setContext
               1   0.0%  99.4%        1   0.0% mongo::UpdateDriver::setModOptions
               1   0.0%  99.4%      106   1.5% mongo::UpdateExecutor::~UpdateExecutor
               1   0.0%  99.5%        5   0.1% mongo::UpdateLifecycleImpl::getImmutableFields
               1   0.0%  99.5%        1   0.0% mongo::UpdateRequest::setMulti (inline)
               1   0.0%  99.5%        2   0.0% mongo::UpdateRequest::setUpsert (inline)
               0   0.0% 100.0%        1   0.0% mongo::UpdateDriver::context
               0   0.0% 100.0%        1   0.0% mongo::UpdateDriver::modOptions
               0   0.0% 100.0%        2   0.0% mongo::UpdateDriver::populateDocumentWithQueryFields
               0   0.0% 100.0%        3   0.0% mongo::UpdateLifecycleImpl::getIndexKeys
               0   0.0% 100.0%        1   0.0% mongo::UpdateRequest::setUpdates (inline)
               0   0.0% 100.0%        3   0.0% mongo::UpdateResult::toString

        So it looks to me like this works fine on Ubuntu 13.10.

        Show
        acm Andrew Morrow added a comment - - edited Hi Mark - I just spun up a clean Ubuntu 13.10 image on Digital Ocean, and I was able to use the cpu profiler: # lsb_release -c Codename: saucy I set up the packages necessary to build. You need to install google-perftools to get a copy of google-pprof since we don't build it. Then cloned mongo, checked out 2.6.0, built it using your options, and started mongod. # sudo apt-get install scons git build-essential libboost-all-dev g++-multilib libunwind8-dev google-perftools # git clone https://github.com/mongodb/mongo.git # git checkout r2.6.0 # scons --variant-dir=linux2/norm --nostrip --full --prefix=/root/opt --allocator=tcmalloc --opt=on --mongod-concurrency-level=db --use-cpu-profiler -j20 install # mkdir -d /data/db # /root/opt/bin/mongod Then in the mongo shell in another session: # /root/opt/bin/mongo > use admin > db.runCommand({_cpuProfilerStart: {profileFilename: './foo.prof'}}); Then run some update load against mongod, then, in the mongo shell # /root/opt/bin/mongo > db.runCommand({_cpuProfilerStop: {}}); I now have a foo.prof file with contents: # ls -la ./foo.prof -rw-r--r-- 1 root root 352217 May 2 14:54 ./foo.prof Running google-pprof (from the ubuntu google-perftools package) against this file gets me useful output: # google-pprof --text /root/opt/bin/mongod ./foo.prof | grep mongo::Update Using local file /root/opt/bin/mongod. Using local file ./foo.prof. Removing _L_unlock_13 from all stack traces. 18 0.2% 61.9% 600 8.2% mongo::UpdateDriver::update 7 0.1% 82.8% 97 1.3% mongo::UpdateDriver::~UpdateDriver 6 0.1% 86.2% 180 2.5% mongo::UpdateDriver::parse 6 0.1% 86.3% 1881 25.8% mongo::UpdateExecutor::execute 6 0.1% 86.4% 215 3.0% mongo::UpdateExecutor::prepare 6 0.1% 86.5% 6 0.1% mongo::UpdateLifecycleImpl::setCollection 5 0.1% 89.6% 109 1.5% mongo::UpdateDriver::addAndParse 5 0.1% 89.7% 15 0.2% mongo::UpdateResult::UpdateResult 4 0.1% 92.4% 73 1.0% mongo::UpdateDriver::UpdateDriver 4 0.1% 92.5% 39 0.5% mongo::UpdateDriver::makeOplogEntryQuery 4 0.1% 92.5% 77 1.1% mongo::UpdateExecutor::UpdateExecutor 3 0.0% 94.8% 40 0.5% mongo::UpdateDriver::clear 3 0.0% 94.9% 14 0.2% mongo::UpdateExecutor::parseQuery 3 0.0% 94.9% 185 2.5% mongo::UpdateExecutor::parseUpdate 3 0.0% 95.0% 17 0.2% mongo::UpdateLifecycleImpl::UpdateLifecycleImpl 1 0.0% 99.4% 1 0.0% mongo::UpdateDriver::refreshIndexKeys 1 0.0% 99.4% 1 0.0% mongo::UpdateDriver::setContext 1 0.0% 99.4% 1 0.0% mongo::UpdateDriver::setModOptions 1 0.0% 99.4% 106 1.5% mongo::UpdateExecutor::~UpdateExecutor 1 0.0% 99.5% 5 0.1% mongo::UpdateLifecycleImpl::getImmutableFields 1 0.0% 99.5% 1 0.0% mongo::UpdateRequest::setMulti (inline) 1 0.0% 99.5% 2 0.0% mongo::UpdateRequest::setUpsert (inline) 0 0.0% 100.0% 1 0.0% mongo::UpdateDriver::context 0 0.0% 100.0% 1 0.0% mongo::UpdateDriver::modOptions 0 0.0% 100.0% 2 0.0% mongo::UpdateDriver::populateDocumentWithQueryFields 0 0.0% 100.0% 3 0.0% mongo::UpdateLifecycleImpl::getIndexKeys 0 0.0% 100.0% 1 0.0% mongo::UpdateRequest::setUpdates (inline) 0 0.0% 100.0% 3 0.0% mongo::UpdateResult::toString So it looks to me like this works fine on Ubuntu 13.10.
        Hide
        ramon.fernandez Ramon Fernandez added a comment -

        Mark Callaghan, we have not been able to reproduce the behavior you describe on this ticket, so we're going to mark it as resolved. I'll post the detailed steps we tried in a separate comment, in case you're interested in following up.


        Regards,
        Ramón.

        Show
        ramon.fernandez Ramon Fernandez added a comment - Mark Callaghan , we have not been able to reproduce the behavior you describe on this ticket, so we're going to mark it as resolved. I'll post the detailed steps we tried in a separate comment, in case you're interested in following up. – Regards, Ramón.

          People

          • Votes:
            1 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: