[SERVER-10623] Mongod fails to start as unprivileged user when {{/sys/devices/system/node}} is not readable and executable. Created: 26/Aug/13  Updated: 11/Jul/16  Resolved: 27/Sep/13

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 2.4.6, 2.5.2
Fix Version/s: 2.5.3

Type: Bug
Priority: Major - P3
Reporter: Hugues Lismonde
Assignee: Kyle Erf
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

ArchLinux, kernel 3.10.9-xxxx-grs-ipv6-64 (OVH) x86_64, Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz


Issue Links:
Related
related to SERVER-12464 Mongod fails to start as unprivileged... Closed
is related to SERVER-18763 Use get_mempolicy() or /proc/zoneinfo... Backlog
Backwards Compatibility: Fully Compatible
Operating System: Linux
Participants:

 Description   

The function boost::filesystem::exists(boost::filesystem::path) throws an exception if some component of the path is not accessible by the executing user. This can cause ProcessInfo::checkNumaEnabled() to throw an uncaught exception at start-up, while it tries to determine whether NUMA interleaving is enabled. Possible resolutions are to detect this by a means that cannot fail, or to recover more gracefully when detection fails.
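
A minimal sketch of the "cannot fail" option, under stated assumptions: it uses C++17 std::filesystem as a stand-in for boost::filesystem (both offer an exists() overload that takes an error_code and reports failure through it instead of throwing). The names Probe and probePath are hypothetical and do not come from the mongod source, and this is not necessarily how the ticket was resolved.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical three-way result: the path exists, does not exist, or could
// not be examined (for example, a parent directory is not searchable).
enum class Probe { kExists, kMissing, kUnknown };

// Uses the error_code overload of exists(), which reports failures through
// `ec` instead of throwing fs::filesystem_error.
Probe probePath(const fs::path& p) {
    assert(!p.empty());
    std::error_code ec;
    const bool found = fs::exists(p, ec);
    if (ec) {
        // The status of the path could not be determined, e.g. EACCES.
        return Probe::kUnknown;
    }
    return found ? Probe::kExists : Probe::kMissing;
}
```

A caller could treat kUnknown as "NUMA state cannot be determined" and continue start-up rather than abort.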

Original report follows.

Hi,

On a brand new dedicated server from OVH, under ArchLinux, mongod fails to start as mongodb user. Starting with the same config as root works flawlessly.

This might be related to the kernel version/options set by OVH as I've found a Gentoo user with the exact same issue. The common trait would be the custom 3.10.9 kernel (http://forums.gentoo.org/viewtopic-p-7382216.html).

Below is the syslog for a failed startup:

Aug 26 17:42:14 lead systemd[1]: Started High-performance, schema-free document-oriented database.
Aug 26 17:42:14 lead mongod[4279]: Mon Aug 26 17:42:14.211 terminate() called, printing stack (if implemented for platform):
Aug 26 17:42:14 lead mongod[4279]: 0xadb736 0x6e3355 0x31af3645c46 0x31af3645c73 0x31af3645e9e 0x31af41de41f 0xad02ed 0xad0b02 0xacf9d5 0xacfa69 0x704d6c 0x70508a 0x705633 0x705659 0x7059f7 0x6cbf99 0x31af2d45bc5 0x6e31f5
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0xadb736]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11myterminateEv+0x45) [0x6e3355]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libstdc++.so.6(+0x5ec46) [0x31af3645c46]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libstdc++.so.6(+0x5ec73) [0x31af3645c73]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libstdc++.so.6(+0x5ee9e) [0x31af3645e9e]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libboost_filesystem.so.1.54.0(_ZN5boost10filesystem6detail6statusERKNS0_4pathEPNS_6system10error_codeE+0x1ff) [0x31af41de41f]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11ProcessInfo16checkNumaEnabledEv+0x2d) [0xad02ed]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11ProcessInfo10SystemInfo17collectSystemInfoEv+0x292) [0xad0b02]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11ProcessInfo20initializeSystemInfoEv+0x85) [0xacf9d5]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo36_mongoInitializerFunction_SystemInfoEPNS_18InitializerContextE+0x9) [0xacfa69]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5boost6detail8function17function_invoker1IPFN5mongo6StatusEPNS3_18InitializerContextEES4_S6_E6invokeERNS1_15function_bufferES6_+0xc) [0x704d6c]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZNK5mongo11Initializer7executeERKSt6vectorISsSaISsEERKSt3mapISsSsSt4lessISsESaISt4pairIKSsSsEEE+0x1aa) [0x70508a]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo21runGlobalInitializersERKSt6vectorISsSaISsEERKSt3mapISsSsSt4lessISsESaISt4pairIKSsSsEEE+0x23) [0x705633]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo26runGlobalInitializersOrDieERKSt6vectorISsSaISsEERKSt3mapISsSsSt4lessISsESaISt4pairIKSsSsEEE+0x19) [0x705659]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo26runGlobalInitializersOrDieEiPKPKcS3_+0x307) [0x7059f7]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(main+0x259) [0x6cbf99]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libc.so.6(__libc_start_main+0xf5) [0x31af2d45bc5]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod() [0x6e31f5]
Aug 26 17:42:14 lead mongod[4279]: Mon Aug 26 17:42:14.217 Got signal: 6 (Aborted).
Aug 26 17:42:14 lead mongod[4279]: Mon Aug 26 17:42:14.221 Backtrace:
Aug 26 17:42:14 lead mongod[4279]: 0xadb736 0x6e381b 0x31af2d59450 0x31af2d593d9 0x31af2d5a7d8 0x6e335a 0x31af3645c46 0x31af3645c73 0x31af3645e9e 0x31af41de41f 0xad02ed 0xad0b02 0xacf9d5 0xacfa69 0x704d6c 0x70508a 0x705633 0x705659 0x7059f7 0x6cbf99
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0xadb736]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0xfb) [0x6e381b]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libc.so.6(+0x35450) [0x31af2d59450]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libc.so.6(gsignal+0x39) [0x31af2d593d9]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libc.so.6(abort+0x148) [0x31af2d5a7d8]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11myterminateEv+0x4a) [0x6e335a]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libstdc++.so.6(+0x5ec46) [0x31af3645c46]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libstdc++.so.6(+0x5ec73) [0x31af3645c73]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libstdc++.so.6(+0x5ee9e) [0x31af3645e9e]
Aug 26 17:42:14 lead mongod[4279]: /usr/lib/libboost_filesystem.so.1.54.0(_ZN5boost10filesystem6detail6statusERKNS0_4pathEPNS_6system10error_codeE+0x1ff) [0x31af41de41f]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11ProcessInfo16checkNumaEnabledEv+0x2d) [0xad02ed]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11ProcessInfo10SystemInfo17collectSystemInfoEv+0x292) [0xad0b02]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo11ProcessInfo20initializeSystemInfoEv+0x85) [0xacf9d5]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo36_mongoInitializerFunction_SystemInfoEPNS_18InitializerContextE+0x9) [0xacfa69]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5boost6detail8function17function_invoker1IPFN5mongo6StatusEPNS3_18InitializerContextEES4_S6_E6invokeERNS1_15function_bufferES6_+0xc) [0x704d6c]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZNK5mongo11Initializer7executeERKSt6vectorISsSaISsEERKSt3mapISsSsSt4lessISsESaISt4pairIKSsSsEEE+0x1aa) [0x70508a]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo21runGlobalInitializersERKSt6vectorISsSaISsEERKSt3mapISsSsSt4lessISsESaISt4pairIKSsSsEEE+0x23) [0x705633]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo26runGlobalInitializersOrDieERKSt6vectorISsSaISsEERKSt3mapISsSsSt4lessISsESaISt4pairIKSsSsEEE+0x19) [0x705659]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(_ZN5mongo26runGlobalInitializersOrDieEiPKPKcS3_+0x307) [0x7059f7]
Aug 26 17:42:14 lead mongod[4279]: /usr/bin/mongod(main+0x259) [0x6cbf99]
Aug 26 17:42:14 lead systemd[1]: mongodb.service: main process exited, code=exited, status=14/n/a
Aug 26 17:42:14 lead systemd[1]: Unit mongodb.service entered failed state.



 Comments   
Comment by Matt Kangas [ 27/Jan/14 ]

See follow-on ticket SERVER-12464

Comment by Daniel Pasette (Inactive) [ 26/Jan/14 ]

kyle.erf, as pointed out by statianzo, it looks like there is still an issue reading /sys/devices/system/node/node1:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/startup_warnings.cpp#L90

Comment by Jason Staten [ 23/Dec/13 ]

I'm still having a related issue in 2.5.4, also on an OVH kernel where /sys/devices/system/node is not readable/executable. On startup, this fatal message is logged:

exception in initAndListen std::exception: boost::filesystem::status:
Permission denied: "/sys/devices/system/node/node1", terminating

While the fix solved processinfo_linux2.cpp's reading of /sys/devices/system/node/node1, there's another attempt to read it inside of startup_warnings.cpp that isn't wrapped with a try/catch.
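
A rough sketch of what wrapping such a check in a try/catch could look like, assuming C++17 std::filesystem in place of boost::filesystem. The helper name existsOrWarn and the warning text are hypothetical; this is not the code in startup_warnings.cpp or processinfo_linux2.cpp.

```cpp
#include <cassert>
#include <filesystem>
#include <iostream>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical helper: returns true or false when the path could be examined,
// and std::nullopt (after writing a warning to `log`) when the check failed.
std::optional<bool> existsOrWarn(const fs::path& p, std::ostream& log) {
    assert(!p.empty());
    try {
        // Throwing overload: raises fs::filesystem_error on, e.g., EACCES.
        return fs::exists(p);
    } catch (const fs::filesystem_error& e) {
        log << "warning: cannot examine " << p << ": "
            << e.code().message() << '\n';
        return std::nullopt;
    }
}
```

With a wrapper like this, an unreadable node directory would produce a warning and an "unknown" result instead of terminating initAndListen.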

Comment by auto [ 27/Sep/13 ]

Author: Kyle Erf (3rf) <erf@mongodb.com>

Message: SERVER-10623 Added error handling to NUMA check

Signed-off-by: Matt Kangas <matt.kangas@mongodb.com>
Branch: master
https://github.com/mongodb/mongo/commit/40b90318d5e94fa2112650f72c371d6bd2e535c7

Comment by Hugues Lismonde [ 27/Aug/13 ]

Thanks for the workaround. It seems a lot of needed folders were unreadable under /sys. Mongod kept failing because it couldn't read /sys/dev/block/9:2/queue/read_ahead_kb. This time the error in mongodb.log was much less "explosive":

Tue Aug 27 21:03:20.257 [initandlisten] exception in initAndListen std::exception: boost::filesystem::status: Permission denied: "/sys/dev/block/9:2/queue/read_ahead_kb", terminating

Currently I've chosen the nuke approach and +rx'ed all folders under /sys. I'll have a look at the permissions on another server with the same mongodb version where it starts correctly.

Comment by Andy Schwerin [ 27/Aug/13 ]

The problem is definitely that /sys/devices/system/node is not +rx for the user in question. Better behavior would be for mongod to issue a warning that it cannot detect if NUMA interleaving is enabled or disabled in this case, or to detect this by a means that is guaranteed to be readable by non-root users, if one exists. I will update the description of this ticket accordingly.

As a workaround, is it reasonable to make said directory +rx in your deployment?

Comment by Hugues Lismonde [ 27/Aug/13 ]

700 also. The system directory and those above it are world readable/browsable.

Comment by Andy Schwerin [ 26/Aug/13 ]

What are the permissions on /sys/devices/system/node?

Comment by Hugues Lismonde [ 26/Aug/13 ]

Hi,

mongod --version:
db version v2.4.6
Mon Aug 26 21:35:37.207 git version: nogitversion

ldd /usr/bin/mongod:
linux-vdso.so.1 (0x0000037e54e2f000)
libpcre.so.1 => /usr/lib/libpcre.so.1 (0x0000037e549a7000)
libpcrecpp.so.0 => /usr/lib/libpcrecpp.so.0 (0x0000037e5479e000)
libsnappy.so.1 => /usr/lib/libsnappy.so.1 (0x0000037e54598000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x0000037e5437a000)
libssl.so.1.0.0 => /usr/lib/libssl.so.1.0.0 (0x0000037e5410e000)
libcrypto.so.1.0.0 => /usr/lib/libcrypto.so.1.0.0 (0x0000037e53d04000)
libboost_thread.so.1.54.0 => /usr/lib/libboost_thread.so.1.54.0 (0x0000037e53aee000)
libboost_filesystem.so.1.54.0 => /usr/lib/libboost_filesystem.so.1.54.0 (0x0000037e538d8000)
libboost_program_options.so.1.54.0 => /usr/lib/libboost_program_options.so.1.54.0 (0x0000037e53669000)
libboost_system.so.1.54.0 => /usr/lib/libboost_system.so.1.54.0 (0x0000037e53466000)
librt.so.1 => /usr/lib/librt.so.1 (0x0000037e5325e000)
libtcmalloc.so.4 => /usr/lib/libtcmalloc.so.4 (0x0000037e52fee000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x0000037e52cea000)
libm.so.6 => /usr/lib/libm.so.6 (0x0000037e529e7000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x0000037e527d1000)
libc.so.6 => /usr/lib/libc.so.6 (0x0000037e52427000)
/lib64/ld-linux-x86-64.so.2 (0x0000037e54c11000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x0000037e52223000)
libz.so.1 => /usr/lib/libz.so.1 (0x0000037e5200d000)

/proc/self/numa_maps is root:root, 444, so this one should be readable, but I don't have a node1, and /sys/devices/system/node/node0 is root:root, 700.

Comment by Andy Schwerin [ 26/Aug/13 ]

What build of mongod and what version are you using? The most likely culprit is that one of /sys/devices/system/node/node1 and /proc/self/numa_maps, or one of their parent directories, is not readable (or executable in the case of directories) by the user launching MongoDB. I'm a touch surprised by this specific behavior, though. Are you running a version of mongodb downloaded from mongodb.org, or one from the Arch packages? If the latter, it would be good to know the results of running "ldd" on the binary, and also to know the version of boost installed in /usr. If you downloaded the package from mongodb.org, the result of running mongod --version would be helpful.

To confirm the hypothesis that it's an unreadable file, could you check the file permissions for the whole path to numa_maps and node1, described above?
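
One way to run that check programmatically, as a sketch only: the code below walks every prefix of a path and records its permission bits, stopping at the first prefix that cannot be examined. It assumes C++17 std::filesystem; PrefixReport and reportPathPermissions are hypothetical names, not part of mongod.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical record for one prefix of the path being examined.
struct PrefixReport {
    fs::path prefix;
    bool statOk;    // false when status() failed, e.g. permission denied
    unsigned mode;  // permission bits such as 0755; meaningful only if statOk
};

// Walks "/a/b/c" as "/", "/a", "/a/b", "/a/b/c" and records the permission
// bits of each prefix, stopping at the first prefix that cannot be examined.
std::vector<PrefixReport> reportPathPermissions(const fs::path& target) {
    assert(!target.empty());
    std::vector<PrefixReport> out;
    fs::path prefix;
    for (const fs::path& part : target) {
        prefix /= part;
        std::error_code ec;
        const fs::file_status st = fs::status(prefix, ec);
        if (ec) {
            // Missing entry or inaccessible parent: record it and stop.
            out.push_back({prefix, false, 0u});
            break;
        }
        const unsigned mode =
            static_cast<unsigned>(st.permissions() & fs::perms::mask);
        out.push_back({prefix, true, mode});
    }
    return out;
}
```

Run as the user that launches mongod against /sys/devices/system/node/node1 and /proc/self/numa_maps, the last entry in the result shows where the walk stopped; a directory with mode 0700 owned by root just before a failed entry would point to the permissions problem discussed in this ticket.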

Generated at Thu Feb 08 03:23:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.