[SERVER-21353] Building with system version of libraries Created: 09/Nov/15  Updated: 05/Apr/17  Resolved: 11/Nov/16

Status: Closed
Project: Core Server
Component/s: Build, JavaScript
Affects Version/s: 3.2.0-rc1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Marek Skalický Assignee: Mark Benvenuto
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File devel.patch     File system-libs.patch    
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

scons all \
-j8 \
--use-system-tcmalloc \
--use-system-pcre \
--use-system-boost \
--use-system-snappy \
--use-system-zlib \
--use-system-stemmer \
--use-system-yaml \
--use-system-mozjs \
--nostrip \
--ssl \
--disable-warnings-as-errors \
%ifarch x86_64
--wiredtiger=on \
%else
--wiredtiger=off \
%endif
--experimental-decimal-support=off \
CCFLAGS="%{?optflags}" LINKFLAGS="%{?__global_ldflags}"

(tried in Fedora Rawhide)

Sprint: Platforms 15 (06/03/16), Platforms 16 (06/24/16), Platforms 17 (07/15/16), Platforms 18 (08/05/16), Platforms 2016-08-26, Platforms 2016-09-19, Platforms 2016-10-10, Platforms 2016-10-31, Platforms 2016-11-21
Participants:

 Description   

I am trying to build MongoDB with the system versions of libraries, but it is currently not possible.

Some fixes were merged after r3.2.0-rc1, so they are already in master. However, I would also like to be able to build with the system version of mozjs.
The problem is that src/mongo/scripting/mozjs/valuereader.cpp uses non-public functions. I tried using the new C++11 functions from <locale> and <codecvt> instead; it builds, but when running 'mongo' I get some Chinese characters instead of the startup warning. What form of data is expected by jsstring?

Another thing: asio is a header-only "library", so it would be good to check for the presence of its header.
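
A header-presence check like this would normally live in the build system's configure step. As a rough illustration of the same idea on the C++ side, here is a minimal sketch that relies on the C++17 `__has_include` preprocessor feature. The `DEMO_*` macros, `kHaveSystemAsio`, `kHaveStdString`, and `asioStatus` are hypothetical names for this example and are not part of MongoDB.

```cpp
#include <cassert>
#include <string>

// __has_include is standard since C++17 and is only usable inside
// preprocessor conditions, so the result is captured in macros first.
#if defined(__has_include)
#  if __has_include(<asio.hpp>)
#    define DEMO_HAVE_SYSTEM_ASIO 1
#  endif
#  if __has_include(<string>)
#    define DEMO_HAVE_STD_STRING 1
#  endif
#endif

#ifdef DEMO_HAVE_SYSTEM_ASIO
constexpr bool kHaveSystemAsio = true;
#else
constexpr bool kHaveSystemAsio = false;
#endif

// Control case: a header that any hosted C++ toolchain provides.
#ifdef DEMO_HAVE_STD_STRING
constexpr bool kHaveStdString = true;
#else
constexpr bool kHaveStdString = false;
#endif

// Human-readable result of the check (hypothetical helper).
std::string asioStatus() {
    return kHaveSystemAsio ? "system asio.hpp found"
                           : "system asio.hpp not found";
}
```

The control check on `<string>` is only there to show that the mechanism reports a header that is certainly installed; whether `<asio.hpp>` is found depends on the machine.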

Patch attached.



 Comments   
Comment by Mark Benvenuto [ 19/Jan/17 ]

In terms of third-party libraries, we have made minor compile fixes to many of them. In terms of substantive changes, in addition to Asio and MozJS, these are the ones where we made feature changes.

  1. gperftools - We disable some of the caching for larger bucket sizes.
  2. mozjs - Additional work to handle x64 Solaris and Linux ARM64 Virtual Address space layout (SERVER-24400, SERVER-22927).
  3. S2 - We have made minor random enhancements.

For Gperftools, we should be doing a configure check for the GetThreadCacheSize function now, or simply remove MONGO_HAVE_GPERFTOOLS_GET_THREAD_CACHE_SIZE altogether, but this is not a significant issue. The Gperftools reporting enhancements we depend on are upstream; we just need to clean up our code.

On MozJS, yes, the benefit is that portability across architectures is handled by the OS vendor instead of the application vendor. We simply give up some robust OOM handling, which really is a rather rare case.

On ICU, supporting the system version of this was a specific case we debated internally. From MongoDB's perspective, we wanted to ensure that the MongoDB data files were portable across machines and versions. For instance, the WiredTiger storage format is always little endian regardless of host architecture. MMapV1 is only officially supported on little endian (though we know its files are not endian portable). For ICU, we persist the collation sort keys into our collections for performance (memcmp is cheaper than comparing strings using a collator at runtime). Since sort keys are not stable across ICU versions, we did not want to risk the user using a different version of ICU and getting wrong results. This also means MongoDB will be very reluctant to upgrade ICU in the future without being able to provide side-by-side support for the collations in icu4c-57.1.
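
To illustrate why persisted sort keys tie data files to one collation version, here is a toy sketch. `makeSortKey` and `compareKeys` are hypothetical helpers, and the two "collation versions" are invented for the example; they are not ICU's algorithm or MongoDB's storage code.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Toy collation rules (assumption for illustration only):
//   version 1 ignores case; version 2 ignores case and spaces.
std::string makeSortKey(const std::string& s, int collationVersion) {
    assert(collationVersion == 1 || collationVersion == 2);
    std::string key;
    for (unsigned char c : s) {
        if (collationVersion == 2 && c == ' ') continue;
        key.push_back(static_cast<char>(std::tolower(c)));
    }
    return key;
}

// Compare two precomputed keys bytewise, the way stored keys can be
// compared with memcmp without running a collator again.
// Returns -1, 0, or 1.
int compareKeys(const std::string& a, const std::string& b) {
    const std::size_t n = std::min(a.size(), b.size());
    const int r = std::memcmp(a.data(), b.data(), n);
    if (r != 0) return r < 0 ? -1 : 1;
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}
```

With these toy rules, "a b" and "ab" compare as different under version 1 but as equal under version 2, so keys written to disk by one version cannot be safely mixed with keys produced by the other.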

Most of the support for using system icu exists because we used the system version during development before we vendored ICU into our tree. We do not have any immediate plans to fix or remove the flag though.

We release across a wide range of Linux variants (11 distro/version pairs at the moment across up to 4 architectures) so the library vendoring helps us isolate ourselves from these OS differences. On the other hand, we do not vendor all libraries such as OpenSSL (a pain point due to many different system versions), or the additional libraries our enterprise product depends on. All of us appreciate your patch for supporting OpenSSL 1.1.0.

Comment by Marek Skalický [ 18/Jan/17 ]

Thanks a lot for this explanation, and sorry for the late answer on a closed bug...
Do only asio and mozjs have MongoDB modifications, or do other bundled libraries have them too?

I really appreciate that you want to push your modifications back upstream (e.g. https://github.com/chriskohlhoff/asio/issues/84#issuecomment-271927267).

Anyway, it is sometimes useful to be able to use system versions of bundled libraries, even if it could cause the loss of some minor functionality.
For example, recovering from OOM would be lost, but the bundled mozjs has code generated for only a few platforms even though it works on more. Someone might want to try building MongoDB for an unsupported platform, and it is harder to patch build scripts than to simply use the system library...
Another example could be the icu library. Why not make it easy to switch to the system library (and support the same locales as the whole system does)?

Comment by Mark Benvenuto [ 04/Nov/16 ]

Hi mskalick -

We want to provide some background on why we do not feel that it is advisable for you to build with the system versions of these components:

On Mozilla JavaScript:
MongoDB changed in the 3.2 release series to use Mozilla's SpiderMonkey JavaScript engine. We chose this engine over Google's V8 because it had greater platform portability, and had better out of memory handling.

One of our goals is to be able to cap the size of the JavaScript heap of a user's request whether it is in the shell or server. This prevents users from exhausting memory in a process with JavaScript, and with proper OOM handling, we can ensure that this is handled in a non-fatal way. Here is an example of a test that will exercise the JavaScript heap in both the shell and the server:

python buildscripts/resmoke.py --executor=serial_run jstests/serial_run/memory.js

Now, in order to ensure proper handling in SpiderMonkey, we install a custom memory allocator. This custom memory allocator is installed via a define at build time JS_USE_CUSTOM_ALLOCATOR. If this define is not set at build time though, SpiderMonkey will use the standard C memory management functions (malloc, etc). In this configuration, the test case above will crash both the MongoDB shell, and server.
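
The idea of capping a heap so that exhaustion surfaces as a recoverable failure can be sketched as follows. This is a minimal illustration under stated assumptions: `CappedHeap` and its methods are hypothetical names, it is not thread-safe, and it is neither SpiderMonkey's allocator interface nor MongoDB's actual custom allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical capped heap: tracks the bytes handed out and refuses any
// request that would exceed the cap, so the caller gets nullptr and can
// fail one request instead of exhausting the whole process.
class CappedHeap {
public:
    explicit CappedHeap(std::size_t capBytes) : _cap(capBytes) {}

    void* allocate(std::size_t bytes) {
        if (bytes > _cap - _used) return nullptr;  // over the cap: fail softly
        void* raw = std::malloc(sizeof(Header) + bytes);
        if (!raw) return nullptr;                  // real OOM from malloc
        Header* h = ::new (raw) Header;
        h->size = bytes;                           // remember size for release()
        _used += bytes;
        return h + 1;                              // user memory follows header
    }

    void release(void* p) {
        if (!p) return;
        Header* h = static_cast<Header*>(p) - 1;
        assert(h->size <= _used);
        _used -= h->size;
        std::free(h);
    }

    std::size_t used() const { return _used; }

private:
    // The union keeps the user block aligned for any fundamental type.
    union Header {
        std::size_t size;
        std::max_align_t forceAlignment;
    };

    std::size_t _cap;
    std::size_t _used = 0;
};
```

A caller that receives `nullptr` can abort just the offending script and report an error. With plain `malloc` and no cap there is no such checkpoint before the process itself runs out of memory.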

On ASIO:
MongoDB made a decision to use the latest available source code for ASIO when we started on a project to do asynchronous networking, and then forked it to support our needs. One of the important areas for this was the SSL integration.
This included a key change to disable ASIO's OpenSSL initialization so that it would remain MongoD's & MongoS's responsibility to initialize OpenSSL.

I am aware that this bundling and duplication of system libraries is against Fedora's packaging recommendations. For these two particular libraries, we need them bundled to deliver on important features.

References:
https://fedoraproject.org/wiki/Bundled_Libraries
https://fedoraproject.org/wiki/Bundled_Software_policy

We appreciate that this is a source of tension for packaging, and we are open to working with you further to understand how we can continue to develop MongoDB in ways that will work well with RedHat.

For now, we would like to close this ticket because we don't intend to make any local changes at this time.

Please feel free to reach out with any additional concerns.

Comment by Marek Skalický [ 13/May/16 ]

Added an updated version of the patch for MongoDB master, which allows building against the system versions of mozjs and asio.

Comment by Marek Skalický [ 24/Nov/15 ]

Hi,

"A direct ./configure; make; make install from mozjs-38.3.0 for me provides the js/CharacterEncoding headers, which in turn provide the wide char conversion functions.

Is there a reason you're going out of your way to avoid those? If it's just my wording, I was speaking about the lack of documentation, more than my expectation that the api was subject to change."

Currently I am using mozjs-38.2.0, but I don't think that is the problem. Yes, the headers are installed normally, but this symbol (JS::LossyUTF8CharsToNewTwoByteCharsZ) is not marked as public API (by JS_PUBLIC_API(void)), and by default symbols are hidden (pragma push(hidden) in config/gcc_hidden.h). This is why I would be happy if MongoDB used a more standard API.

"The random Chinese you're seeing is a result of not doing exactly the utf8 -> ? encoding that spidermonkey wants. Which, as you've noticed, isn't utf8 -> utf16, at least as C++11 understands it."

Do you know the difference? What should be changed?
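
The visibility point raised in this comment (a declaration can be present in an installed header while the symbol is still not exported from the shared library) can be illustrated with a small sketch. It assumes a GCC-compatible compiler; `DEMO_PUBLIC_API`, `demoInternalHelper`, and `demoPublicEntryPoint` are hypothetical names, not SpiderMonkey's.

```cpp
#include <cassert>

#if defined(__GNUC__)
#  define DEMO_PUBLIC_API __attribute__((visibility("default")))
#else
#  define DEMO_PUBLIC_API
#endif

// Everything declared between push(hidden) and pop is hidden by default.
#if defined(__GNUC__)
#pragma GCC visibility push(hidden)
#endif

// Hidden: code inside the library can call it, but a program that only
// links against the shared library cannot resolve this symbol, even if
// it sees the declaration in a header.
int demoInternalHelper(int x);

// Explicitly exported: the attribute overrides the hidden default.
DEMO_PUBLIC_API int demoPublicEntryPoint(int x);

#if defined(__GNUC__)
#pragma GCC visibility pop
#endif

int demoInternalHelper(int x) { return x * 2; }
int demoPublicEntryPoint(int x) { return demoInternalHelper(x) + 1; }
```

Inside a single program both functions are callable. The difference only appears when the code is built as a shared library and another binary tries to link against the hidden symbol.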

Comment by Mira Carey [ 17/Nov/15 ]

Marek,

Thanks for the patch, I'd been meaning to evaluate a system version of spidermonkey, but hadn't had the time to take more than a cursory look.

First, some big caveats:

  • We haven't been testing with a system version of mozjs.
  • We do a bit of a dance to avoid pulling in an nspr dependency, which involves providing our own PosixNSPR.cpp. This doesn't work with a system NSPR.
    • src/mongo/scripting/SConscript probably needs to avoid including PosixNSPR.cpp, otherwise you'll have duplicate symbols kicking around.
    • We use PR_CreateThread assuming it's our copy. A quick glance shows that we probably call it with the wrong argument for scope: PR_LOCAL_THREAD should be PR_GLOBAL_THREAD.
  • We rely on providing a jscustomallocator to handle out of memory errors. There are some use cases (stopping runaway js before it OOM's the whole server) that you won't be able to handle out of the box.

And a question:

  • What version of MozJS are you using?
    • A direct ./configure; make; make install from mozjs-38.3.0 for me provides the js/CharacterEncoding headers, which in turn provide the wide char conversion functions.
      • Is there a reason you're going out of your way to avoid those? If it's just my wording, I was speaking about the lack of documentation, more than my expectation that the api was subject to change.
    • The random Chinese you're seeing is a result of not doing exactly the utf8 -> ? encoding that spidermonkey wants. Which, as you've noticed, isn't utf8 -> utf16, at least as C++11 understands it.
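
One general way that readable ASCII turns into CJK-looking output is a mismatch between the encoding a buffer holds and the encoding the consumer assumes. The sketch below illustrates that failure mode only and is not a diagnosis of the attached patch. The helper names are hypothetical, and the input is restricted to ASCII to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Widen each byte to its own 16-bit unit. For pure ASCII input this
// yields the same text as 16-bit code units.
std::u16string widenAscii(const std::string& bytes) {
    std::u16string out;
    for (unsigned char c : bytes) {
        assert(c < 0x80);  // this sketch handles ASCII only
        out.push_back(static_cast<char16_t>(c));
    }
    return out;
}

// Pack each pair of consecutive bytes into one 16-bit unit (little-endian).
// This is effectively what happens when a byte buffer is read as if it
// already contained 16-bit text. A trailing odd byte is dropped.
std::u16string misreadAsTwoByte(const std::string& bytes) {
    std::u16string out;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        out.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }
    return out;
}

// True if the code unit falls in the CJK Unified Ideographs block.
bool isCjkUnifiedIdeograph(char16_t u) {
    return u >= 0x4E00 && u <= 0x9FFF;
}
```

For the input "Server", the misread version produces the units 0x6553, 0x7672, and 0x7265, all of which fall in the CJK Unified Ideographs block, while the widened version stays readable.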

In either case, it's a little too late in our release cycle to get this in for 3.2.0. I'll aim to get this in for 3.3 (the 3.4 dev branch) and look into a backport for 3.2.1 or 3.2.2.

-Jason
