[SERVER-7254] Mongo init.d scripts not working on ubuntu Created: 04/Oct/12  Updated: 06/Apr/23  Resolved: 07/Mar/14

Status: Closed
Project: Core Server
Component/s: Packaging
Affects Version/s: 2.0.7, 2.4.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jay Perry Assignee: Ernie Hershey
Resolution: Done Votes: 4
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

ubuntu 11.10


Issue Links:
Related
is related to SERVER-8774 mongodb is uninstallable on Debian Sq... Closed
is related to DOCS-2869 Clarify fork option Closed
Backwards Compatibility: Fully Compatible
Operating System: Linux
Participants:

 Description   

Hi,

I installed mongodb20-10gen. It installs fine and starts the service, but afterwards the init.d/mongodb service script seems somewhat broken. The status command reports a failure even though the service is running. After investigating, it looks like the wrong pid is being stored in the pidfile, which causes status to fail. This also breaks the "stop" command. Updating the pidfile with the correct pid from "ps aux" fixes things, but this is not an ideal solution. It may be that something is being forked and the pid is being stored at the wrong time. Let me know if you need any other information. Please investigate.

Thanks,
Jay



 Comments   
Comment by Ernie Hershey [ 07/Mar/14 ]

Thanks! The default is fork=false. We should make that more clear in the docs. I opened DOCS-2869 to address that.

If you require fork=true in upstart, "expect daemon" will work.

In debian or with sysvinit, using a configuration like Brandon posted should work, in which mongod forks and creates its own pidfile instead of start-stop-daemon creating the pidfile.

The init script and configuration in our .deb packages will also work, in which mongod doesn't fork itself and start-stop-daemon is able to track the pidfile itself.

Comment by Brandon Checketts [ 07/Mar/14 ]

The actual code for it is in mongo/src/mongo/db/initialize_server_global_state.cpp (https://github.com/mongodb/mongo/blob/5ea897e30b45447d55289e33f636da3017b1e8db/src/mongo/db/initialize_server_global_state.cpp)

The forkServer() function begins at line #92 and actually forks twice. I think that is why Upstart is unable to track the PID of the eventual child process, since you mentioned that Upstart only tracks the first PID it executes.

Somebody familiar with the actual mongo code would need to explain why it needs to fork twice. After the first fork it calls setsid() and chdir, and after the second fork it assigns stdin, stdout, and stderr to /dev/null.
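
The sequence described above (fork, setsid() and chdir, fork again, point stdin/stdout/stderr at /dev/null) can be sketched in Python. This is a minimal illustration of the general double-fork pattern, not a translation of forkServer(); the function name double_fork_demo and the pipe used to report the final PID are assumptions added for the example.

```python
import os
import struct


def double_fork_demo():
    """Daemonize a short-lived process and report the PIDs involved.

    Returns (first_child_pid, daemon_pid) as seen by the launching process.
    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()

    first_child = os.fork()
    if first_child > 0:
        # Launcher: first_child is the only PID it learns about directly.
        os.close(write_fd)
        data = os.read(read_fd, 8)
        os.close(read_fd)
        os.waitpid(first_child, 0)  # the first child exits almost immediately
        (daemon_pid,) = struct.unpack("q", data)
        return first_child, daemon_pid

    # First child: detach from the session, then fork again.
    os.close(read_fd)
    os.setsid()
    os.chdir("/")
    if os.fork() > 0:
        os._exit(0)  # the first child's PID is now stale

    # Grandchild: the process that would keep running as the daemon.
    os.write(write_fd, struct.pack("q", os.getpid()))
    os.close(write_fd)
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    os._exit(0)  # a real daemon would enter its main loop here
```

Calling double_fork_demo() returns two different PIDs. A supervisor that recorded only the first one would be left pointing at a process that has already exited.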

Comment by Scott Lowe [ 07/Mar/14 ]

@Ernie Hershey - The docs make this sound like the default setting:

<quote>
fork is true, which enables a daemon mode for mongod, which detaches (i.e. “forks”) the MongoDB from the current session and allows you to run the database as a conventional server.
</quote>

Comment by Ernie Hershey [ 07/Mar/14 ]

bchecketts, daniel.taschik@signavio.com, sosh - do you need fork=true for some reason? Our Debian init script and Ubuntu upstart configurations assume that the mongod daemon will not be forking when started by them.

In Upstart -
"If you do not specify the expect stanza, Upstart will track the life cycle of the first PID that it executes in the exec or script stanzas. "
http://upstart.ubuntu.com/cookbook/#expect

In Debian with sysvinit, we pass --background and --make-pidfile to start-stop-daemon in the init script, indicating that it should manage the pid file and expect the child program (mongod) not to fork on its own.
http://www.unix.com/man-page/Linux/8/start-stop-daemon/
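
As a rough model of the non-forking setup described above, the Python sketch below has a launcher start a program in the background and write the PID it started to a pidfile. It is not start-stop-daemon's actual implementation; the function name launch_with_pidfile and the pidfile path are hypothetical. The recorded PID stays correct only as long as the launched program does not fork away from it, which is the assumption the packaged init script makes.

```python
import subprocess
import sys


def launch_with_pidfile(argv, pidfile):
    """Start argv in the background and record its PID (hypothetical helper)."""
    proc = subprocess.Popen(argv)
    # The launcher, not the program, owns the pidfile in this model.
    with open(pidfile, "w") as fh:
        fh.write(f"{proc.pid}\n")
    return proc
```

For example, launch_with_pidfile([sys.executable, "-c", "import time; time.sleep(30)"], "example.pid") leaves a pidfile whose content matches the running process, because the launched program stays in the foreground.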

Comment by Brandon Checketts [ 04/Feb/14 ]

This is what I have that is working. It has been a while since I installed these (package is mongodb-10gen v2.4.3)

init file (/etc/init.d/mongod_data_d)
http://pastebin.com/pqSsqvTF

Config File: (/etc/mongod_data_d.conf)
http://pastebin.com/7di4SVmC

Comment by Daniel Taschik [ 04/Feb/14 ]

Added the path, but it's still the same.

root@mongodb03:/var/run# /etc/init.d/mongodb start
[FAIL] Starting database: mongodb failed!
 
root@mongodb03:/var/run#  ps aux|grep mongod
mongodb  19936  0.8  3.3 2778156 34208 ?       Sl   16:41   0:00 /usr/bin/mongod --config /etc/mongodb.conf

The config looks now like this:

root@mongodb03:/var/run# cat /etc/mongodb.conf
# mongo.conf - generated from Puppet
 
#where to log
logpath=/var/log/mongodb/mongodb.log
logappend=true
 
# fork and run in background
fork = true
port = 27017
dbpath= /var/lib/mongodb
# Disable the HTTP interface (Defaults to localhost:27018).
nohttpinterface = true
# Configure ReplicaSet membership
replSet = eff0
# Manually added requirement
setParameter=textSearchEnabled=true
pidfilepath=/var/run/mongod.pid

There is no mongod.pid in /var/run/!

Comment by Brandon Checketts [ 04/Feb/14 ]

In my working instances, I have a pidfilepath parameter similar to this in the config file:

pidfilepath=/var/lock/mongod_data_c.pid

you could try adding that, then make sure that everything is stopped, and restart (with the --make-pidfile option still removed from the init file)

Comment by Daniel Taschik [ 04/Feb/14 ]

# mongo.conf - generated from Puppet
 
#where to log
logpath=/var/log/mongodb/mongodb.log
logappend=true
 
# fork and run in background
fork = true
port = 27017
dbpath= /var/lib/mongodb
# Disable the HTTP interface (Defaults to localhost:27018).
nohttpinterface = true
# Configure ReplicaSet membership
replSet = eff0
# Manually added requirement
setParameter=textSearchEnabled=true

Comment by Brandon Checketts [ 04/Feb/14 ]

Daniel, can you post the contents of your mongo config file in /etc/mongodb.conf?

Comment by Daniel Taschik [ 04/Feb/14 ]

I did exactly as you said, but it's not working. Mongo gets started even though the init script returns a failure.

 root@mongodb03:/var/run# /etc/init.d/mongodb start
[FAIL] Starting database: mongodb failed! 

But the process is running:

mongodb  18922  1.0  3.3 2786352 34164 ?       Sl   16:20   0:00 /usr/bin/mongod --config /etc/mongodb.conf

I can no longer see a mongodb pid file in /var/run/, although one was there before.

Comment by Brandon Checketts [ 04/Feb/14 ]

Anybody still having this problem, try this:

  • Edit /etc/init.d/mongo* (whatever your exact init file is)
  • On about line 122, inside the start_server() function, take out the "--make-pidfile" argument
  • Make sure that all mongo instances are stopped
  • Remove mongo pidfile in /var/lock/ (or wherever your config file is placing them)
  • Try stopping/starting mongo, and report back here if that works.

Comment by Daniel Taschik [ 04/Feb/14 ]

The problem is also present in Debian Wheezy.

Comment by Scott Lowe [ 04/Feb/14 ]

@Ernie - sorry for the delay. And yes, I'm afraid I'm still having the issue it seems (on Debian Squeeze with fork=true enabled). Like Daniel, ps -A is reporting a different pid number to the one that is stored in /var/run/mongodb.pid

Comment by Daniel Taschik [ 23/Nov/13 ]

The issue is still present in mongodb version 2.4.8. The PID in /var/run/mongodb.pid is still different from the PID the process is running under:

ps output:
mongodb 2174 0.6 3.3 2779148 34436 ? Sl 14:30 0:02 /usr/bin/mongod --config /etc/mongodb.conf

cat /var/run/mongodb.pid
2265

Would be great if someone could follow up on this.
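
The comparison above (PID from ps versus PID in the pidfile) can be automated. The Python sketch below is a minimal check, assuming a pidfile that contains a single integer; the helper names are hypothetical. It only tests whether some process with the recorded PID exists, not whether that process is mongod.

```python
import os


def read_pidfile(path):
    """Return the integer PID stored in a pidfile."""
    with open(path) as fh:
        return int(fh.read().strip())


def pid_is_running(pid):
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pidfile_is_stale(path):
    """True when the pidfile names a PID that no longer exists."""
    return not pid_is_running(read_pidfile(path))
```

With the values reported above, pidfile_is_stale("/var/run/mongodb.pid") would be expected to return True if no process with PID 2265 exists, which matches what an init script sees when status fails.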

Comment by Scott Lowe [ 16/Jul/13 ]

@Ernie - Thanks for the response. I'll post some more details later (probably tomorrow) after I try to understand what's going on a little better.

Comment by Ernie Hershey [ 15/Jul/13 ]

I'm sorry there's no update yet. sosh can you help me understand the problem you're still having? I thought you were okay after turning off fork=true?

Comment by Scott Lowe [ 08/Jul/13 ]

Hi, any update on this issue? My package manager updates for mongodb are now failing, which I think is related to this. It's becoming rather worrying in production.

Comment by Brandon Checketts [ 29/Jun/13 ]

It has been a while now since I looked at this. I'm running several instances on several machines, and it looks like I made this change to the init file for both configsvr and regular mongod instances. The mongod instances do have fork=true in the config files.

Comment by Ernie Hershey [ 29/Jun/13 ]

bchecketts - were you running with fork=true either in your mongod command line or config file?

Comment by Brandon Checketts [ 20/May/13 ]

Exact same issue here on Ubuntu 12.04.2.

I've resolved it by removing the --make-pidfile option in the start_server() function in the init file.

My understanding of this process is that the start-stop-daemon command creates the pidfile (as root) before spawning the actual mongod process (as the mongodb user). In some cases (at least when not configsvr=true) mongod must fork again before saving its own pidfile. Since the file created by start-stop-daemon is owned by root, the less-privileged mongodb user cannot overwrite it (perhaps this failure should be logged, or logged at a less verbose level?), leaving the pidfile containing a pid that is no longer correct.

On my machine, the PID in the pidfile created with the --make-pidfile option was consistently exactly three less than the PID shown in the output of 'ps'.

After making that change to the init file, I can now reliably start/stop the mongod process using the expected commands.

(FYI, you may need to manually remove the pidfiles since they were created as root)
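
The ownership argument above can be illustrated with the classic Unix mode-bit check. The Python sketch below assumes the pidfile is root-owned with a typical 0644 mode, which this thread does not confirm; the helper names are hypothetical, and the check deliberately ignores ACLs, capabilities, and directory permissions.

```python
import os
import stat


def can_write(st, uid, gids=()):
    """May a process with this uid and these gids write a file with stat st?"""
    if uid == 0:
        return True  # root bypasses the mode bits
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)


def pidfile_writable_by(path, uid, gids=()):
    """Apply can_write to an existing file on disk."""
    return can_write(os.stat(path), uid, gids)
```

Under that assumption, a non-root user such as mongodb fails the check for a root-owned 0644 file, which is consistent with the stale pidfile described above and with the need to remove the old pidfiles manually.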

Comment by Ernie Hershey [ 26/Apr/13 ]

I replicated the issue without configsvr=true. There definitely could be another issue at play here but I believe if you're controlling mongod via init scripts, you shouldn't need fork=true, running as a config server or as a normal instance. I'll test some more to be sure.

Comment by Scott Lowe [ 24/Apr/13 ]

Sure. If I remove the fork=true then it behaves normally. However, the strange thing is fork=true runs fine on my replica set servers (without configsvr=true). It's only the config servers with configsvr=true that are displaying this problem. (Not sure if you replicated the problem with or without configsvr=true). Everything else about the servers is identical as far as I can tell (same OS ver, mongo ver, CPU etc), unless there's some difference at the virtualization level not visible to me.

So do you mean then that I don't need (or even shouldn't have) fork=true in configs for instances that are started by the init script?

Thanks

Comment by Ernie Hershey [ 24/Apr/13 ]

That is very helpful, thank you.

If it's easy for you, can you run without fork = true and see if that fixes it and still works as you expect? I think the init script takes care of backgrounding the process, so mongod shouldn't need to fork itself as well. I played with this a bit and was able to see the behavior you describe with fork = true in /etc/mongodb.conf. If I take that out, everything works fine as far as I can tell.

Comment by Scott Lowe [ 24/Apr/13 ]

@Ernie Hershey: In case it's useful: According to `dmesg | grep -i numa` NUMA is turned off. There is no process running that the PID file refers to (and the running process id is higher than the one in the PID file).

Comment by Scott Lowe [ 24/Apr/13 ]

I appear to have the same issue. However, I'm on Debian 6 and Mongo 2.4, and am seeing this with mongod configured as a config server. The background is here: http://serverfault.com/questions/501700/strange-behaviour-starting-process-with-init-script In a nutshell, the init.d/mongodb script displays a 'Starting database: mongodb failed!' message when starting, but the process seems to be running ok and there is nothing bad in the mongo logs. Subsequently the init script is not usable (for stop/restart etc.), as the mongod process id from ps is different to that stored in the PID file. I also have fork=true, in case that's relevant. Host is a 64-bit Xen VPS instance at Linode running on a Xeon E5-2670.

Comment by Ernie Hershey [ 18/Apr/13 ]

Hi Jay!

Can you help me diagnose this by letting me know if you have numactl installed? Is there a running process that the pid actually does refer to?

Thanks
Ernie

Comment by Jay Perry [ 09/Oct/12 ]

Any status on this fix? Does this exist in 2.2.x as well?
