[SERVER-32773] mlock: Operation not permitted Created: 18/Jan/18 Updated: 13/Aug/19 Resolved: 25/Jan/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 3.6.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kai Brobeil | Assignee: | Spencer Jackson |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | Use MongoDB 3.6.1 with the official Arch package and try to run 'createUser' anywhere. |
| Participants: |
| Description |
|
When using the 'createUser' command, the server crashes without applying the changes. Please see the log below.
|
| Comments |
| Comment by James Harvey [ 13/Aug/19 ] |
|
Would there be a problem with having MongoDB detect an mlock() failure and continue gracefully rather than crash? Perhaps it could print a message saying that performance might improve if mlock() were allowed, like the message MongoDB gives about numactl. I'm sure some customers wouldn't want it to use mlock() at all, given the negative performance impact it can have in certain situations.
I'm asking as the maintainer of the MongoDB package in the Arch Linux AUR repository. A lot of our users build packages in a clean chroot environment, using systemd-nspawn. Prior to 4.2.0, only 8 unit tests crashed due to mlock(), so I was able to disable those when inside systemd-nspawn and still let the other unit tests, db tests, and integration tests run. As of 4.2.0, mlock() usage appears to have been greatly expanded, to the point where it's no longer feasible to find and disable all of the tests that crash from mlock(). After finding and disabling 19 more unit tests, it appears likely that hundreds would now need to be disabled: not only unit tests but also db and integration tests. The only reasonable choice I have is to disable testing entirely when building inside systemd-nspawn, which I'd like to see fixed.
I realize that systemd-nspawn arguments allowing mlock() have been discussed here, and although that could be a solution for users running systemd-nspawn themselves, I think it's highly unlikely that Arch would change its clean chroot build scripts (called devtools) to allow mlock() calls, when AFAIK this hasn't been run into with any other package. |
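The "detect the failure and continue" behaviour asked about in this comment could look roughly like the following. This is only a sketch in Python using ctypes, not MongoDB's code or a proposed patch: `try_lock_buffer` is a hypothetical helper name, and it assumes a Unix-like system where the C library's `mlock` is reachable through `ctypes.CDLL(None)`.

```python
import ctypes
import errno
import os
import warnings


def try_lock_buffer(buf) -> bool:
    """Try to mlock() a ctypes buffer; warn and return False on failure.

    Hypothetical helper for illustration only. It never raises on an
    mlock() error, so the caller can keep running with unlocked memory.
    """
    # dlopen(NULL): resolves symbols from the running process, including libc.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int

    rc = libc.mlock(ctypes.addressof(buf), ctypes.sizeof(buf))
    if rc == 0:
        return True

    err = ctypes.get_errno()
    name = errno.errorcode.get(err, str(err))
    warnings.warn(
        f"mlock failed ({name}: {os.strerror(err)}); "
        "continuing with unlocked memory"
    )
    return False


if __name__ == "__main__":
    secret = ctypes.create_string_buffer(4096)
    locked = try_lock_buffer(secret)
    print("buffer locked" if locked else "buffer not locked, continuing anyway")
```

Whether continuing is acceptable depends on why the memory was being locked in the first place; if the lock exists to keep secrets out of swap, a silent fallback may not be appropriate.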
| Comment by Spencer Jackson [ 25/Jan/18 ] |
|
That's great, I'm so glad it worked! Thanks for filing this, you had a fun issue, and I learned a bunch about capabilities. I'll close out this ticket, so have a great day, and thanks for using MongoDB! |
| Comment by Kai Brobeil [ 25/Jan/18 ] |
|
You're my hero! It works! Thank you for your swift and competent analysis. That's definitely another very good reason to use MongoDB for our team (not that I need another one).
I don't see a button to close the issue, so I'm guessing you'll do that? |
| Comment by Spencer Jackson [ 24/Jan/18 ] |
|
OK! I think that's it! Check the man page for systemd-nspawn: https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html
There's a flag called --capability. I think you have to bring up your container passing it the CAP_IPC_LOCK capability. The idea is that some syscalls would affect stuff outside the container, so nspawn somehow prevents syscalls which require non-whitelisted capabilities from running. It would be sad if you ran some untrusted code in a container, and it changed your system time to 2038 and then mlocked all of your RAM.
There's also a flag called --system-call-filter, which looks like it prevents a bunch of named syscalls from being run, for pretty much the same reason. A glance at the nspawn source suggests that it automatically unblacklists mlock if the CAP_IPC_LOCK capability is passed to --capability.
So, that's why you're getting "Operation not permitted" even though the soft RLIMIT_MEMLOCK was non-zero: the syscall was just banned. And when you set the capability directly on the binary, the capability itself was banned, so the binary couldn't run.
In short, if you can get nspawn to accept something like --capability=CAP_IPC_LOCK, I think it should work. Your syntax may vary; I've never really used nspawn before. |
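One way to see from inside the container whether CAP_IPC_LOCK is available at all is to decode the capability masks in /proc/self/status. The Python sketch below assumes a Linux procfs; `capability_in_set` is a hypothetical helper name, and the bit number 14 for CAP_IPC_LOCK comes from `<linux/capability.h>`. It checks the bounding set (`CapBnd`) by default, since a capability missing from that set cannot be gained by executing a binary that carries file capabilities.

```python
CAP_IPC_LOCK = 14  # bit number of CAP_IPC_LOCK in <linux/capability.h>


def capability_in_set(status_text: str,
                      field: str = "CapBnd",
                      cap_bit: int = CAP_IPC_LOCK) -> bool:
    """Return True if `cap_bit` is set in the given /proc/<pid>/status field.

    `status_text` is the full text of a status file; `field` is one of the
    hexadecimal mask fields such as CapInh, CapPrm, CapEff or CapBnd.
    """
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name == field:
            mask = int(value.strip(), 16)
            return bool((mask >> cap_bit) & 1)
    raise KeyError(f"{field} not found in status text")


if __name__ == "__main__":
    with open("/proc/self/status") as fh:
        text = fh.read()
    print("CAP_IPC_LOCK in bounding set:", capability_in_set(text))
    print("CAP_IPC_LOCK effective:", capability_in_set(text, field="CapEff"))
```

If the bounding-set check prints False inside the container and True on the host, that would be consistent with the explanation above, and restarting the container with --capability=CAP_IPC_LOCK is the thing to try.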
| Comment by Kai Brobeil [ 24/Jan/18 ] |
|
Oh, now that you ask: we're running in a systemd-nspawn container.
vault does not work either.
/usr/bin/ping does work.
Maybe this is related to nspawn after all? I wish I understood this more deeply... |
| Comment by Spencer Jackson [ 24/Jan/18 ] |
|
That's... weird. When you get "bash: /usr/bin/mongod: Operation not permitted", does anything appear in dmesg, or in journalctl -ef?
Are you using Docker, or any other containerization solution, by any chance? I found a few occurrences of that message in the context of containers.
There appears to be a package called "vault" in the Community repository. If you install it, can the vault binary run? Its PKGBUILD gives it the cap_ipc_lock+ep capability too.
Does /usr/bin/ping work? It uses a different capability. |
| Comment by Kai Brobeil [ 24/Jan/18 ] |
|
Spencer Jackson, thanks for the explanation.
uname -a: Linux <hostname> 4.14.3-1-ARCH #1 SMP PREEMPT Thu Nov 30 18:33:13 UTC 2017 x86_64 GNU/Linux
Hope this helps. |
| Comment by Spencer Jackson [ 23/Jan/18 ] |
|
yorrd, your error is unexpected. I'm on Arch Linux too, and cannot reproduce this with the mongodb-3.6.1-2 package. "Operation not permitted" means that mlock received an EPERM, which on modern kernels gets emitted when the process doesn't have CAP_IPC_LOCK and the soft RLIMIT_MEMLOCK limit is 0. If the soft limit is too small, but not 0, I would expect you to get "Cannot allocate memory". Your limits look more than sufficient though. Can you show me the output of uname -a? |
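The errno reasoning in this comment can be written down as a small decision function. This is a sketch in Python under stated assumptions: `classify_mlock_failure` and its return labels are hypothetical names made up for illustration, and the "blocked-elsewhere" branch reflects the explanation reached elsewhere in this ticket (a container restriction), not something mlock(2) itself reports.

```python
import errno
import resource


def classify_mlock_failure(err: int, soft_limit: int) -> str:
    """Map an mlock() errno plus the soft RLIMIT_MEMLOCK to a likely cause."""
    if err == errno.ENOMEM:
        # "Cannot allocate memory": the limit is non-zero but too small.
        return "limit-too-small"
    if err == errno.EPERM and soft_limit == 0:
        # "Operation not permitted": no CAP_IPC_LOCK and a zero soft limit.
        return "no-capability-and-zero-limit"
    if err == errno.EPERM:
        # EPERM despite a non-zero limit: something else refused the call,
        # for example a container restriction.
        return "blocked-elsewhere"
    return "other"


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("soft RLIMIT_MEMLOCK:", soft)
    print(classify_mlock_failure(errno.EPERM, soft))
```

With the reporter's raised limit of 2147483648, an EPERM falls into the third branch, which is why the error was described as unexpected.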
| Comment by Kai Brobeil [ 20/Jan/18 ] |
|
Sure, here is the ulimit output. As per the other thread, I have increased the memlock limit from 16777216 to 2147483648, which I figure should be sufficient.
Here are the complete logs. |
| Comment by Kelsey Schubert [ 19/Jan/18 ] |
|
Hi yorrd, Thanks for opening a new ticket. After reviewing the diagnostic.data and stack trace, I have a few more questions to help us with our investigation.
Thank you for your help, |
| Comment by Kai Brobeil [ 18/Jan/18 ] |
|
The diagnostics archive is being attached. |