[SERVER-71608] Segmentation fault: in mongo::LockManager::lock Created: 24/Nov/22  Updated: 29/Oct/23  Resolved: 14/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 6.0.2, 6.0.3
Fix Version/s: 7.0.0-rc0, 6.0.5

Type: Bug
Priority: Major - P3
Reporter: R K
Assignee: Mark Benvenuto
Resolution: Fixed
Votes: 3
Labels: crash, segfault
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: mongod.log, mongodb60.log, truss.log
Issue Links:
Backports
Related
related to SERVER-73904 Update FreeBSD spidermonkey configura... Closed
is related to SERVER-73905 Implement getCurrentNativeThreadId fo... Closed
Assigned Teams:
Server Development Platform
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.0
Steps To Reproduce:

Build and run MongoDB 6.0.2 or 6.0.3 on FreeBSD amd64 or aarch64; it segfaults after startup.

I'm maintaining ports of MongoDB 4.2 to 5.0 on FreeBSD and they run fine.

I attached mongodb60.log. The crash is reliably reproducible, and others on the internet have reported it as well.

 Description   

After startup of MongoDB 6.0 the server gets a segmentation fault.

Thread 1 received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
absl::lts_20210324::container_internal::raw_hash_set<absl::lts_20210324::container_internal::NodeHashMapPolicy<mongo::ResourceId, mongo::PartitionedLockHead*>, absl::lts_20210324::hash_internal::Hash<mongo::ResourceId>, std::__1::equal_to<mongo::ResourceId>, std::__1::allocator<std::__1::pair<mongo::ResourceId const, mongo::PartitionedLockHead*> > >::find<mongo::ResourceId> (this=0x47dd6c90, key=..., hash=<optimized out>) at src/third_party/abseil-cpp-master/abseil-cpp/absl/container/internal/raw_hash_set.h:1372
1372    src/third_party/abseil-cpp-master/abseil-cpp/absl/container/internal/raw_hash_set.h: No such file or directory.
(gdb) bt
#0  absl::lts_20210324::container_internal::raw_hash_set<absl::lts_20210324::container_internal::NodeHashMapPolicy<mongo::ResourceId, mongo::PartitionedLockHead*>, absl::lts_20210324::hash_internal::Hash<mongo::ResourceId>, std::__1::equal_to<mongo::ResourceId>, std::__1::allocator<std::__1::pair<mongo::ResourceId const, mongo::PartitionedLockHead*> > >::find<mongo::ResourceId> (
    this=0x47dd6c90, key=..., hash=<optimized out>)
    at src/third_party/abseil-cpp-master/abseil-cpp/absl/container/internal/raw_hash_set.h:1372
#1  absl::lts_20210324::container_internal::raw_hash_set<absl::lts_20210324::container_internal::NodeHashMapPolicy<mongo::ResourceId, mongo::PartitionedLockHead*>, absl::lts_20210324::hash_internal::Hash<mongo::ResourceId>, std::__1::equal_to<mongo::ResourceId>, std::__1::allocator<std::__1::pair<mongo::ResourceId const, mongo::PartitionedLockHead*> > >::find<mongo::ResourceId> (
    this=0x47dd6c90, key=...)
    at src/third_party/abseil-cpp-master/abseil-cpp/absl/container/internal/raw_hash_set.h:1386
#2  mongo::LockHead::migratePartitionedLockHeads (
    this=this@entry=0x4865b300)
    at src/mongo/db/concurrency/lock_manager.cpp:390
#3  0x0000000004699794 in mongo::LockManager::lock (
    this=0x48722c60, resId=..., request=0x481126f0, 
    mode=<optimized out>)
    at src/mongo/db/concurrency/lock_manager.cpp:527
#4  0x00000000046a0140 in mongo::LockerImpl::_lockBegin (
    this=0x48016d00, opCtx=0x4864cc00, resId=..., 
    mode=1219626888)
    at src/mongo/db/concurrency/lock_state.cpp:910
#5  0x00000000046a22cc in mongo::LockerImpl::lock (
    this=0x48016d00, opCtx=0x4864cc00, resId=..., 
    mode=mongo::MODE_X, deadline=...)
    at src/mongo/db/concurrency/lock_state.cpp:546
#6  0x00000000046978a8 in mongo::Lock::DBLock::DBLock (
    this=0xffffffffe860, opCtx=0x4864cc00, db=..., 
    mode=<optimized out>, deadline=..., 
    skipGlobalAndRSTLLocks=false)
    at src/mongo/db/concurrency/d_concurrency.cpp:226
#7  0x0000000003fe8964 in mongo::AutoGetDb::AutoGetDb (
    this=0xffffffffe848, opCtx=0x48b20788, dbName=..., 
    mode=mongo::MODE_X, deadline=..., secondaryDbNames=...)
    at src/mongo/db/catalog_raii.cpp:171
#8  0x0000000002c0d204 in mongo::(anonymous namespace)::logStartup (opCtx=0x4864cc00) at src/mongo/db/mongod_main.cpp:277
#9  mongo::(anonymous namespace)::_initAndListen (
    serviceContext=<optimized out>, listenPort=<optimized out>)
    at src/mongo/db/mongod_main.cpp:677
#10 0x0000000002c0b3fc in mongo::(anonymous namespace)::initAndListen (service=0x0, listenPort=<optimized out>)
    at src/mongo/db/mongod_main.cpp:850
#11 0x0000000002c06270 in mongo::mongod_main (argc=3, 
    argv=<optimized out>) at src/mongo/db/mongod_main.cpp:1548
#12 0x0000000002c05bc4 in main (argc=0, argv=0x48b20788)
    at src/mongo/db/mongod.cpp:47
(gdb)



 Comments   
Comment by Pawel Kraszewski [ 22/Feb/23 ]

I can confirm that FreeBSD 13.1-RELEASE-p6 can run mongodb60-6.0.4_1 built from ports with a default configuration. Thank you.

Comment by R K [ 17/Feb/23 ]

Thank you for the time, effort, and fix.

Comment by Githook User [ 15/Feb/23 ]

Author:

{'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'}

Message: SERVER-71608 Add support to jscustomallocator.cpp for FreeBSD

(cherry picked from commit c419698b577f7924d2d6fc6bd3f7bd922f1d0dd7)
Branch: v6.0
https://github.com/mongodb/mongo/commit/d0b1e392cca66b2c5b399b0d3fe5a65f1068cd33

Comment by Githook User [ 14/Feb/23 ]

Author:

{'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'}

Message: SERVER-71608 Add support to jscustomallocator.cpp for FreeBSD
Branch: master
https://github.com/mongodb/mongo/commit/c419698b577f7924d2d6fc6bd3f7bd922f1d0dd7

Comment by Mark Benvenuto [ 10/Feb/23 ]

Sorry for the delay. This issue was recently brought to my attention. It appeared a little bizarre at first but was caused by issues with MozJS on FreeBSD. These would show up as random memory corruption in various mongo programs that used MozJS. The test JavaScript shell would not even start before crashing. Valgrind was very helpful in ultimately finding the issue.

In the 6.0 timeframe, we upgraded to a new MozJS and, for some reason, it depends on a quirk of our memory allocator (see https://github.com/mongodb/mongo/blob/master/src/mongo/scripting/mozjs/jscustomallocator.cpp#L40-L48) which is not implemented in the FreeBSD port. After implementing this support, MongoD and Mongo appear to work fine. I also switched the build to the FreeBSD system memory allocator (--allocator=system) because FreeBSD uses jemalloc, which is just as good as gperftools.
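As an illustration only: the comment above does not name the allocator facility involved. Assuming it is the ability to ask the allocator how large a heap block really is (an assumption, not something the ticket confirms), a minimal sketch of a portable query might look like the following. The names usableSize, trackedMalloc, trackedFree and totalBytes are hypothetical and are not taken from jscustomallocator.cpp. The underlying calls are malloc_usable_size (Linux via <malloc.h>, FreeBSD via <malloc_np.h>), malloc_size (macOS) and _msize (Windows).

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(__linux__)
#include <malloc.h>         // malloc_usable_size
#elif defined(__FreeBSD__)
#include <malloc_np.h>      // malloc_usable_size
#elif defined(__APPLE__)
#include <malloc/malloc.h>  // malloc_size
#elif defined(_WIN32)
#include <malloc.h>         // _msize
#endif

// Hypothetical helper: returns how many bytes the allocator actually
// reserved for `ptr`, or 0 for a null pointer.
inline std::size_t usableSize(void* ptr) {
    if (ptr == nullptr) {
        return 0;
    }
#if defined(__linux__) || defined(__FreeBSD__)
    return malloc_usable_size(ptr);
#elif defined(__APPLE__)
    return malloc_size(ptr);
#elif defined(_WIN32)
    return _msize(ptr);
#else
#error "No usable-size query known for this platform"
#endif
}

// Hypothetical byte accounting built on the helper. Not thread-safe;
// a real allocator shim would need per-thread or atomic counters.
static std::size_t g_totalBytes = 0;

inline std::size_t totalBytes() {
    return g_totalBytes;
}

inline void* trackedMalloc(std::size_t bytes) {
    void* p = std::malloc(bytes);
    g_totalBytes += usableSize(p);  // adds 0 if the allocation failed
    return p;
}

inline void trackedFree(void* p) {
    // free() is not told the size, so it has to be recovered from the allocator.
    g_totalBytes -= usableSize(p);
    std::free(p);
}
```

The point of the sketch is that a free-style call receives only a pointer, so any per-engine memory accounting needs either a platform query like this or a size header stored next to each block. A platform missing from the #if chain fails at compile time instead of silently falling back to something else.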

I will commit this fix (along with a few other portability fixes) to master and then backport them to 6.0.

Output snippet from valgrind:

==39232== Invalid write of size 8
==39232==    at 0xFB89D7A: ProcessCodeSegmentMap::ProcessCodeSegmentMap() (WasmProcess.cpp:125)
==39232==    by 0xFB732F1: js_new<ProcessCodeSegmentMap> (Utility.h:517)
==39232==    by 0xFB732F1: js::wasm::Init() (WasmProcess.cpp:381)
==39232==    by 0xEF01371: JS::detail::InitWithFailureDiagnostic(bool) (Initialization.cpp:174)
==39232==    by 0xE9F4BDA: JS_Init() (Initialization.h:69)
==39232==    by 0xE9F16A8: mongo::mozjs::MozJSScriptEngine::MozJSScriptEngine(bool) (engine.cpp:71)
==39232==    by 0xE9F144A: mongo::ScriptEngine::setup(bool) (engine.cpp:56)
==39232==    by 0x9F1E50F: mongo::(anonymous namespace)::_initAndListen(mongo::ServiceContext*, int) (mongod_main.cpp:591)
==39232==    by 0x9F105D7: mongo::(anonymous namespace)::initAndListen(mongo::ServiceContext*, int) (mongod_main.cpp:950)
==39232==    by 0x9F0B747: mongo::mongod_main(int, char**) (mongod_main.cpp:1688)
==39232==    by 0x9F08C41: main (mongod.cpp:47)
==39232==  Address 0x1a020948 is 0 bytes after a block of size 72 alloc'd
==39232==    at 0x11FE8BC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==39232==    by 0xEA41798: mongo_arena_malloc(unsigned long, unsigned long) (jscustomallocator.cpp:205)
==39232==    by 0xEA421C9: js_arena_malloc(unsigned long, unsigned long)::$_0::operator()(void*, unsigned long) const (jscustomallocator.cpp:241)
==39232==    by 0xEA41A78: void* mongo::sm::wrap_alloc<js_arena_malloc(unsigned long, unsigned long)::$_0>(js_arena_malloc(unsigned long, unsigned long)::$_0&&, void*, unsigned long) (jscustomallocator.cpp:135)
==39232==    by 0xEA4199D: js_arena_malloc(unsigned long, unsigned long) (jscustomallocator.cpp:240)
==39232==    by 0xEA41AFB: js_malloc(unsigned long) (jscustomallocator.cpp:245)
==39232==    by 0xFB732CB: js_new<ProcessCodeSegmentMap> (Utility.h:517)
==39232==    by 0xFB732CB: js::wasm::Init() (WasmProcess.cpp:381)
==39232==    by 0xEF01371: JS::detail::InitWithFailureDiagnostic(bool) (Initialization.cpp:174)
==39232==    by 0xE9F4BDA: JS_Init() (Initialization.h:69)
==39232==    by 0xE9F16A8: mongo::mozjs::MozJSScriptEngine::MozJSScriptEngine(bool) (engine.cpp:71)
==39232==    by 0xE9F144A: mongo::ScriptEngine::setup(bool) (engine.cpp:56)
==39232==    by 0x9F1E50F: mongo::(anonymous namespace)::_initAndListen(mongo::ServiceContext*, int) (mongod_main.cpp:591)
==39232==

Comment by Eric Pierce [ 09/Feb/23 ]

I hope this helps pinpoint research on this, as I would very much like to see this working as well.  I was able to get 6.0.2 running on FreeBSD (13.1-p5) by adding the mongo user to the wheel group.  Perhaps a minor/subtle permissions issue?  I don't have much I can offer in the way of logs since it's been 2-3 weeks since I set this up, though I'd be happy to provide anything I can to further pinpoint if needed.

Since that time, it's been up and running without issue.  My mongo instance isn't under heavy load (I use it for Graylog metadata), but it has been stable with no issues since I made that change.

Comment by Pawel Kraszewski [ 24/Jan/23 ]

Regarding the "needs verification" tag on this issue: mongodb60 has been removed from the FreeBSD binary repositories and the official source port has been marked as "IGNORE".

I can't imagine a better verification than a takedown by the official maintainers...

Comment by Pawel Kraszewski [ 02/Jan/23 ]

I can confirm the problem on 2 different FreeBSD 13.1 installations (one totally fresh), with repository-default MongoDB 6.0.2, and a distribution-default configuration file.

I gathered a truss log (truss is FreeBSD's equivalent of strace); it is attached along with the matching mongod.log. I hope this will shed some light on the problem.

  mongod.log truss.log
