[SERVER-54747] Vendor tcmalloc into MongoDB's third party sources Created: 24/Feb/21  Updated: 01/Mar/23

Status: Blocked
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Tammy Bailey (Inactive) Assignee: Backlog - Storage Engines Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-74523 Upgrade abseil to 2023-08-02 LTS Closed
Assigned Teams:
Storage Engines
Sprint: Storage - Ra 2021-03-08, Storage - Ra 2021-03-22
Participants:
Story Points: 5

 Description   

This ticket is to add the new TCMalloc to the MongoDB repository and integrate with SCons. The source code will be located in the directory src/third_party/tcmalloc. The TCMalloc source contains BUILD.bazel files for use with the Bazel build system; these Bazel files should be removed and the appropriate SCons files added.



 Comments   
Comment by Andrew Shuvalov (Inactive) [ 25/Jul/21 ]

Added a comment on performance here.

Comment by Andrew Morrow (Inactive) [ 19/Mar/21 ]

tammy.bailey I agree with bruce.lucas that it is fine to leave it in while the evaluation proceeds.

Comment by Bruce Lucas (Inactive) [ 19/Mar/21 ]

tammy.bailey, I don't think there's any need to do that. I think having that alternative implementation to, as you say, explore its potential, has been helpful and has generated useful discussion as input to the design decision when/if we're ready to make that decision.

Comment by Tammy Bailey (Inactive) [ 19/Mar/21 ]

acm I think there's been a rather large misunderstanding about the heap profiler implementation provided in this build.

We had some extra bandwidth, so provided an implementation of the heap profiler using the tcmalloc allocation sampling API both in the interest of producing a complete build and to explore the potential of the cleaner implementation, where the actual library code wouldn't need to be altered. This change was outside the scope of this ticket and was definitely not a design decision for the heap profiler by any means. The new tcmalloc library is still pending performance evaluation, so it's uncertain whether we'll even get to a point where a new heap profiler implementation will be needed.

I'm wondering if perhaps we should back the heap profiler changes out of this build to avoid future confusion?

Comment by Andrew Morrow (Inactive) [ 18/Mar/21 ]

bruce.lucas - The google/tcmalloc maintainers have made pretty clear that they don't intend to maintain ABI: https://github.com/google/tcmalloc/issues/2. Now ABI and API are admittedly different - the former is a much stricter bar. But an explicit statement not to keep ABI stable certainly suggests a willingness to iterate API and implementation/behavior. I'll note also that unlike the gperftools/tcmalloc, the google/tcmalloc project hasn't made any tags, created any branches, or issued any releases. It leads me to the conclusion that they intend to treat this project much more as something that lives at HEAD to reflect the state of the software within google. At the very least it probably means the end of offering or supporting --use-system-tcmalloc for the new tcmalloc.

Comment by Bruce Lucas (Inactive) [ 18/Mar/21 ]

For future reference, I'll record another question I have about the built-in heap profiling API for the new tcmalloc: do we have an understanding of the long-term stability of the interface and function from an upstream perspective?

Comment by Andrew Morrow (Inactive) [ 18/Mar/21 ]

Just a quick note, but IIUC the plan here is to replace the existing heap profiler (mongodb written one in src/mongo/util) with the heap profiler from the new tcmalloc. There are a few issues there. The first is that since we know we can't use the new tcmalloc on all platforms, we can't actually remove mongo the heap profiler: we will still need it on those platforms where the new tcmalloc can't (yet?) be used. Another concern is that the new tcmalloc profiler is based on using the abseil backtracing facilities. One of the advantages of the current mongodb heap profiler is that it re-uses the mongodb side backtracing facilities that we developed over the past two years. I'm somewhat reluctant to see us reintroduce a new backtracing facility, especially one that isn't under our control.

Comment by Tammy Bailey (Inactive) [ 09/Mar/21 ]

bruce.lucas do you mean an evergreen patch build (e.g. to run the testing) or something else?

Comment by Bruce Lucas (Inactive) [ 09/Mar/21 ]

alison.felizzi, tammy.bailey, I discussed this with the perf team to coordinate. For both the general performance testing and for taking look at the heap profiler ideally we would want an Evergreen build to work from. Can you make that available when you feel you have something that is ready to look at?

Comment by Alison Felizzi (Inactive) [ 09/Mar/21 ]

Quick status update:

I also spent a little bit of time validating the aarch64 and PPC builds.

Appending to the minimum compiler version discoveries in an earlier post:

tcmalloc guarantees Clang 9+ and GCC 9.2+ support as a minimum.

For aarch64, it's Clang 11+ as minimum. This is due to older versions of aarch64 Clang not having support for 'asm goto' statements with output constraints (as in Clang 9 and Clang 10). Unfortunately I couldn't get older versions of Clang to compile for aarch64 without manually hacking on the ASM statements (which I don't think is an ideal change to maintain). It is important to note that aarch64 is currently maintained on a best effort basis by the maintainers (as opposed to first tier support). Just in case if this influences the decision to maintain an aarch64 Linux build.

PPC is failing similarly on earlier compiler versions, as was discovered when doing the initial builds on x86 in the earlier post. Will need to further validate with access to a PPC machine since I've only been able to test with patch builds (having limited flexibility on swapping out and experimenting with different compiler versions).

Comment by Bruce Lucas (Inactive) [ 05/Mar/21 ]

tammy.bailey, understood. I'll work with the performance team to see if we can take a look to get an early start on that discussion in the context of their work. No need at this point for any additional builds beyond that from you.

Comment by Tammy Bailey (Inactive) [ 05/Mar/21 ]

bruce.lucas The first goal of this project - and the intent of this ticket - is to provide a build to the performance team so they can evaluate how the new library performs against workloads that reproduce excessive memory fragmentation. Supporting the heap profiler in a way that meets the needs and requirements of the server team is indeed a future goal of the project, but our understanding (and please correct me if I'm mistaken) was that it was not required for the perf team to perform their evaluation. Thus we performed the most straightforward implementation possible to meet the first goal. We are fully aware that there is discussion to be had around what the final heap profiler implementation should look like with respect to the new library. That said, we are happy to provide an Evergreen build if you'd like to experiment with the new tcmalloc sampling API!

Comment by Bruce Lucas (Inactive) [ 05/Mar/21 ]

Since the new tcmalloc library now provides its own allocation sampling API, I've implemented the heap profiler using this interface as opposed to using allocation hooks

The function and performance of the heap profiler is very important to support and triage as we use it not infrequently to debug memory allocation issues in the field, so we need to take a close look at this. Do you have an evergreen build we could use for that purpose?

Comment by Alison Felizzi (Inactive) [ 05/Mar/21 ]

Status update:

I've since added support for compiling the tcmalloc specific utility interfaces (in src/mongo/util) with the new tcmalloc library. I've additionally implemented a variant of the heap profiler that uses the new tcmalloc API (may still be buggy, currently doing some last validations). Some notes on the current approach:

  • Since the new tcmalloc library now provides its own allocation sampling API, I've implemented the heap profiler using this interface as opposed to using allocation hooks (as was done with the gperf library). I felt this may have been a cleaner approach instead of managing a custom malloc hook interface inside the tcmalloc library. If preferred I can implement it with allocation hooks if we want to keep the two approaches consistent.
  • Due to their differing API's, I've currently created two variants of each tcmalloc-specific util file (one for gperf and the other for tcmalloc). There's currently a fair bit of duplication and it may be worth to find a nice set of abstractions between the two libraries. I currently avoided doing this for the purpose of moving forward.

I'll be further working on cleaning up the branch however this is a good point to start sharing the tcmalloc branch (on my mongo fork) so others can now checkout the work and experiment.


The branch is found here: https://github.com/alisonfel/mongo/tree/server-54747-tcmalloc

  • To compile per-cpu tcmalloc you can initialize the build with the following command:
    • ./buildscripts/scons.py --variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars --allocator=tcmalloc --ninja generate-ninja ICECC=icecc CCACHE=ccache
    • (emphasis on using the gcc toolchain as discussed above)
  • To compile per-thread tcmalloc:
    • ./buildscripts/scons.py --variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars --allocator=tcmalloc-legacy --ninja generate-ninja ICECC=icecc CCACHE=ccache
  •  To compile gperf tcmalloc:
    • ./buildscripts/scons.py --variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars --allocator=tcmalloc-legacy --ninja generate-ninja ICECC=icecc CCACHE=ccache

Note: This work has only been tested on an x86_64 Linux platform (will further work on the ARM64 and PPC platforms).

Please let me know if there are any issues with compiling and running, will happily help!

Comment by Alison Felizzi (Inactive) [ 03/Mar/21 ]

Status update:

I've now have tcmalloc compiling and linking in with the mongodb server binaries. The current exception is that the build skips over the tcmalloc specific utility interfaces (src/mongo/util).

Some of the current limitations and compromises of the tcmalloc build, which will need to be addressed before production, include:

  • tcmalloc guarantees Clang 9+ and GCC 9.2+ support as a minimum.
    • The default mongodb toolchain minimums seem to be Clang 7 and GCC 8.2. This may need to be updated if we want to compile the new tcmalloc by default.
    • I've tried compiling it with lower versions of Clang ie. with the mongodb toolchain. This ultimately fails due to missing ASM features. Thus Clang 9 is a minimum requirement.
    • I've successfully compiled it with GCC 8.2 (using the default mongodb toolchain) and it seems to run without crashing (limited testing so far). Though it may still be ideal to match tcmalloc's official requirements to avoid potential future regressions.
  • I've needed to check out the most recent upstream revision of Abseil (as mentioned above) into the third party sources for tcmalloc to build against. tcmalloc generally keeps up to date with the latest version of abseil (its upstream code reflects any changes in upstream abseil).
  • I can compile tcmalloc in per-cpu and per-thread caching mode. The default is per-cpu caching mode.
    • As a temporary solution I've modified the `–allocator` flag in SCons to point to the following options:
      • 'tcmalloc' : Compile the new tcmalloc in per-cpu caching mode
      • 'tcmalloc-legacy': Compile the new tcmalloc in per-thread caching mode
      • 'tcmalloc-gperf': Compile the gperf tcmalloc (as we currently do)
      • 'auto': Compile with the 'tcmalloc' option if on Linux (non-Android), otherwise 'tcmalloc-gperf' (for Windows etc)
    • Will re-evaluate the flag options at a later point. As I currently understand we may want to roll 'tcmalloc-gperf' and 'tcmalloc-legacy' into the one option, switching between the two libraries depending on the platform.

Moving forward, my current focus is trying to compile the tcmalloc specific utility interfaces in src/mongo/util with the new tcmalloc library. I'll be trying to determine what parts of the API and its surrounding options are still compatible.

Comment by Alison Felizzi (Inactive) [ 01/Mar/21 ]

Short status update:

I've been focusing on developing a build of the new tcmalloc in the MongoDB repository by adding the necessary SCons support. Besides creating a new SConscript file for the vendored tcmalloc repository, I've needed to patch in additional build support to some of tcmalloc's dependencies. TCMalloc is heavily dependent on abseil-cpp, which we additionally vendor in our third party modules (abseil-cpp-master). Our current SCons build of  abseil-cpp only covers a limited set of targets/sources however. The additional components tcmalloc is dependent on includes (in Bazel syntax):

"@com_google_absl//absl/base",
"@com_google_absl//absl/base:config",
"@com_google_absl//absl/base:core_headers",
"@com_google_absl//absl/base:dynamic_annotations",
"@com_google_absl//absl/debugging:leak_check",
"@com_google_absl//absl/debugging:stacktrace",
"@com_google_absl//absl/debugging:symbolize",
"@com_google_absl//absl/memory",
"@com_google_absl//absl/strings",
"@com_google_absl//absl/numeric:bits",
"@com_google_absl//absl/algorithm:container",
"@com_google_absl//absl/debugging:debugging_internal",
"@com_google_absl//absl/synchronization",
"@com_google_absl//absl/hash:hash",
"@com_google_absl//absl/strings:str_format",
"@com_google_absl//absl/time",
"@com_google_absl//absl/types:optional",
"@com_google_absl//absl/types:span",
"@com_google_absl//absl/functional:function_ref"

Our current abseil checkout is missing the following targets:

"@com_google_absl//absl/numeric:bits",
"@com_google_absl//absl/functional:function_ref"

The missing targets seem to be used widespread in tcmalloc (i.e I don't believe it would be worth patching out of tcmalloc's source). In order to properly build tcmalloc we may need to also update our abseil checkout to a more recent revision?

 

As a temporary solution, I'm going to work on a more recent version of abseil (locally) to continue development. I hope to have an initial build of tcmalloc together soon (1 or 2 days).

Generated at Thu Feb 08 05:34:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.