[SERVER-2833] mongod.exe does not shut down correctly as a Windows service Created: 25/Mar/11  Updated: 12/Jul/16  Resolved: 09/Jun/12

Status: Closed
Project: Core Server
Component/s: Usability
Affects Version/s: 1.8.0
Fix Version/s: 2.0.7, 2.1.0

Type: Bug Priority: Major - P3
Reporter: Daniel Weber Assignee: Tad Marshall
Resolution: Done Votes: 5
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 7 64-bit


Issue Links:
Duplicate
is duplicated by SERVER-6040 Stopping the mongod Windows Service R... Closed
Related
related to SERVER-1434 "Net stop mondodb" returns an error o... Closed
related to SERVER-2719 mongod running as a service on Window... Closed
Operating System: Windows
Participants:

 Description   

Rewritten description: When a MongoDB server is stopped by the Windows Service Control Manager (with "net stop mongodb" or during system shutdown or restart), mongo wasn't notifying the Service Control Manager of its steps to shutdown and was ignoring a direct shutdown notification, leading to an unclean shutdown.

Original description: I've created a service on my pc to start mongod on every start-up of the system.
When I try to connect to mongo from commandline by using the mongo command the result is the following error:
Error: couldn't connect to server 127.0.0.1 shell/mongo.js:79 exception: connect failed

We figured out that the problem is that mongod didn't start properly. When I have a look at the service list the status is always "starting" and didn't change. The service isn't able to start up properly.

When I go to the data folder where all my db data are stored and remove the mongod.lock file. The service is able to start and I can connect to mongoDB.
The pid which is inside the mongod.lock file isn't available when I have a look at the tast manager.

When I reboot my pc the problem is going to occur again. Removing the mongod.lock file resolves the problem.

Therefore I guess there is a problem that the service didn't shutdown and remove the mongod.lock file when you shutdown your pc.

We were able to reproduce the problem on different computers.



 Comments   
Comment by Tad Marshall [ 10/Jun/12 ]

Backported to 2.0 branch for 2.0.7.

Comment by auto [ 10/Jun/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-2833 – shut down cleanly on OS or Service Controller shutdown event

Added code to log shutdown events from the Windows Service Controller
and to respond correctly to those events. Shutdown from the console
on logoff or shutdown callbacks (stop ignoring them).
Branch: v2.0
https://github.com/mongodb/mongo/commit/e1504ed07ff7c4b0f1c96da32c36ecf607ac0229

Comment by Andy Schwerin [ 09/Jun/12 ]

Tad to backport for 2.0.7.

Comment by Tad Marshall [ 08/Jun/12 ]

Changed title and description to better describe the issue, rather than the symptoms.

Comment by auto [ 02/Dec/11 ]

Author:

{u'login': u'', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-2833 – shut down cleanly on OS or Service Controller shutdown event

Added code to log shutdown events from the Windows Service Controller
and to respond correctly to those events. Shutdown from the console
on logoff or shutdown callbacks (stop ignoring them).
Branch: master
https://github.com/mongodb/mongo/commit/c8e45cce25d4de4e877e8179f21d8c93166bc964

Comment by Tad Marshall [ 02/Dec/11 ]

After reworking the code, my best guess for what is happening is that on the customers' machines where this problem is seen, there are services running that take a little while to shut down. This introduces a race condition in which while Windows is waiting for the Service Controller to respond to the shutdown message, it starts to kill individual processes itself. Because we were ignoring the direct shutdown signal, we just get killed and don't shutdown cleanly. I (hopefully) fixed this by adding code to respond correctly to both the callback from the Service Controller and also to the direct message from the OS. I also added code to log the events from the Service Controller and do the shutdown on a named thread, so it should be clear from the log whether we got the message and how we responded to it.
I'm going to mark this bug as "fixed", but if anyone still sees it when running version 2.1.0 or later, please reopen the bug and provide as much detail as you can. Thanks!

Comment by Tad Marshall [ 01/Dec/11 ]

Like Robert Stam, I am unable to reproduce the "no clean shutdown of Windows service", but I don't like the messages that are generated when we shut down in this mode.
1) We hook the Windows shutdown message, but don't respond to it. "CTRL_SHUTDOWN_EVENT signal (ignored)". We should start our shutdown at this point and not wait for the Windows Service Controller to tell us something that we (should now) already know ... we are supposed to shut down now.
2) In my log file, we show that we are starting to shut down but don't show the reason. We should log the callback we get from the Service Control Manager to make it clear how the shutdown was triggered.
3) There are error messages in the shutdown stream that don't happen when we are running as a console app and get a ctrl-C. The problem is that we are being called on a new thread, but we don't do the thread initialialization that would allow us to shut down correctly. We need code like this:

Client::initThread( "Windows Service shutdown" );

None of this covers the fact that I can't reproduce the reported problem. But I can fix the problems that I can reproduce and we can see if that makes the original problem go away.

Comment by Tad Marshall [ 01/Dec/11 ]

@Mayur Ahir – please create a separate Jira ticket for your issues on Ubuntu. This Jira ticket is specific to problems with the Windows version when installed as a Windows Service. Thank you!

Comment by Mayur Ahir [ 06/Sep/11 ]

I am using v1.8.3 with Ubuntu 11.04 and still getting same error after reboot.
Here is the error log,

$ mongo
MongoDB shell version: 1.8.3
connecting to: test
Tue Sep 6 18:37:38 Error: couldn't connect to server 127.0.0.1 shell/mongo.js:79
exception: connect failed

Comment by Michael Catanzariti [ 21/Jun/11 ]

net start mongodb

=== BEGINING OF FILE

Tue Jun 21 09:48:09 [initandlisten] MongoDB starting : pid=6088 port=27017 dbpath=D:\mongoDB\data 64-bit
Tue Jun 21 09:48:09 [initandlisten] db version v1.8.2, pdfile version 4.5
Tue Jun 21 09:48:09 [initandlisten] git version: 433bbaa14aaba6860da15bd4de8edf600f56501b
Tue Jun 21 09:48:09 [initandlisten] build sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42
Tue Jun 21 09:48:09 [initandlisten] waiting for connections on port 27017
Tue Jun 21 09:48:09 [websvr] web admin interface listening on port 28017

=== END OF FILE

Comment by Michael Catanzariti [ 21/Jun/11 ]

net stop mongodb

=== BEGINING OF FILE

Tue Jun 21 09:47:14 shutdown: going to close listening sockets...
Tue Jun 21 09:47:14 closing listening socket: 292
Tue Jun 21 09:47:14 closing listening socket: 300
Tue Jun 21 09:47:14 shutdown: going to flush diaglog...
Tue Jun 21 09:47:14 shutdown: going to close sockets...
Tue Jun 21 09:47:14 shutdown: waiting for fs preallocator...
Tue Jun 21 09:47:14 shutdown: closing all files...
Tue Jun 21 09:47:14 closeAllFiles() finished
Tue Jun 21 09:47:14 shutdown: removing fs lock...
Tue Jun 21 09:47:14 [websvr] Listener: accept() returns -1 errno:10038 An operation was attempted on something that is not a socket.
Tue Jun 21 09:47:14 [initandlisten] Listener: accept() returns -1 errno:10038 An operation was attempted on something that is not a socket.
Tue Jun 21 09:47:14 [websvr] select() failure: ret=-1 errno:10038 An operation was attempted on something that is not a socket.
Tue Jun 21 09:47:14 [initandlisten] select() failure: ret=-1 errno:10038 An operation was attempted on something that is not a socket.
Tue Jun 21 09:47:14 [initandlisten] now exiting
Tue Jun 21 09:47:14 dbexit:
Tue Jun 21 09:47:14 [initandlisten] shutdown: going to close listening sockets...
Tue Jun 21 09:47:14 [initandlisten] shutdown: going to flush diaglog...
Tue Jun 21 09:47:14 [initandlisten] shutdown: going to close sockets...
Tue Jun 21 09:47:14 [initandlisten] shutdown: waiting for fs preallocator...
Tue Jun 21 09:47:14 [initandlisten] shutdown: closing all files...
Tue Jun 21 09:47:14 closeAllFiles() finished
Tue Jun 21 09:47:14 [initandlisten] shutdown: removing fs lock...
Tue Jun 21 09:47:14 [initandlisten] couldn't remove fs lock 0
Tue Jun 21 09:47:14 dbexit: really exiting now

=== END OF FILE

Comment by Michael Catanzariti [ 21/Jun/11 ]

There's nothing noticeable at the time of the shutdown...

Comment by Robert Stam [ 20/Jun/11 ]

It's a long shot, but is there anything in the Windows event viewer at the time of the shutdown?

Comment by Robert Stam [ 20/Jun/11 ]

Does "net stop mongodb" from the command prompt work? What about "net start mongodb" after that?

Can you post the log file for that to compare against?

Comment by Michael Catanzariti [ 20/Jun/11 ]

Hi Robert,

Yes, all the machine runs Windows 7 Enterprise with MongoDB 1.8.2

Comment by Robert Stam [ 20/Jun/11 ]

I realize you have already given me your log file. I was just adding my log file for comparison.

I tried to follow your steps as closely as possible and still was unable to reproduce.

The only thing I see that is different is that you are using Windows 7 Enterprise and I am using Windows 7 Ultimate. Also, this time I tested using MongoDB 1.8.2, but last time I was unable to reproduce I was using MongoDB 1.8.1, so I don't think anything changed between these two versions.

Are all the machines you are able to reproduce this on running Windows 7 Enterprise?

Comment by Michael Catanzariti [ 20/Jun/11 ]

Hi Robert,

I just installed version 1.8.2 and still reproduce the problem...

=== BEGINING OF FILE

Mon Jun 20 18:08:04 CTRL_SHUTDOWN_EVENT signal (ignored)
Mon Jun 20 18:10:04 [initandlisten] MongoDB starting : pid=2060 port=27017 dbpath=D:\mongoDB\data 64-bit
Mon Jun 20 18:10:04 [initandlisten] db version v1.8.2, pdfile version 4.5
Mon Jun 20 18:10:04 [initandlisten] git version: 433bbaa14aaba6860da15bd4de8edf600f56501b
Mon Jun 20 18:10:04 [initandlisten] build sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42
Mon Jun 20 18:10:04 [initandlisten] exception in initAndListen std::exception: old lock file, terminating
Mon Jun 20 18:10:04 dbexit:
Mon Jun 20 18:10:04 [initandlisten] shutdown: going to close listening sockets...
Mon Jun 20 18:10:04 [initandlisten] shutdown: going to flush diaglog...
Mon Jun 20 18:10:04 [initandlisten] shutdown: going to close sockets...
Mon Jun 20 18:10:04 [initandlisten] shutdown: waiting for fs preallocator...
Mon Jun 20 18:10:04 [initandlisten] shutdown: closing all files...
Mon Jun 20 18:10:04 closeAllFiles() finished
Mon Jun 20 18:10:04 dbexit: really exiting now

=== END OF FILE

Comment by Michael Catanzariti [ 20/Jun/11 ]

Robert,

I already gave you my config and the mongodb log about this problem.
I can add that we can reproduce it on different machines.

How can I help you to reproduce the problem ?

Comment by Robert Stam [ 20/Jun/11 ]

I am unable to reproduce this. Just tried again with the latest 1.8.2 release. Here's the content of my log file after

1. installing MongoDB as a service
2. starting the service
3. shutdown and reboot

The last few lines of the log file show MongoDB starting up cleanly after the reboot.

=====log file=====
C:\data\db2>type log
Mon Jun 20 11:00:39 dbexit:
Mon Jun 20 11:00:39 shutdown: going to close listening sockets...
Mon Jun 20 11:00:39 shutdown: going to flush diaglog...
Mon Jun 20 11:00:39 shutdown: going to close sockets...
Mon Jun 20 11:00:39 shutdown: waiting for fs preallocator...
Mon Jun 20 11:00:39 shutdown: closing all files...
Mon Jun 20 11:00:39 closeAllFiles() finished
Mon Jun 20 11:00:39 dbexit: really exiting now
Mon Jun 20 11:00:58 [initandlisten] MongoDB starting : pid=2972 port=27017 dbpat
h=c:\data\db2 64-bit
Mon Jun 20 11:00:58 [initandlisten] db version v1.8.2, pdfile version 4.5
Mon Jun 20 11:00:58 [initandlisten] git version: 433bbaa14aaba6860da15bd4de8edf6
00f56501b
Mon Jun 20 11:00:58 [initandlisten] build sys info: windows (6, 1, 7600, 2, '')
BOOST_LIB_VERSION=1_42
Mon Jun 20 11:00:58 [initandlisten] waiting for connections on port 27017
Mon Jun 20 11:00:58 [websvr] web admin interface listening on port 28017
Mon Jun 20 11:08:09 CTRL_SHUTDOWN_EVENT signal (ignored)
Mon Jun 20 11:08:10 shutdown: going to close listening sockets...
Mon Jun 20 11:08:10 closing listening socket: 268
Mon Jun 20 11:08:10 closing listening socket: 276
Mon Jun 20 11:08:10 [initandlisten] Listener: accept() returns -1 errno:10038 An
operation was attempted on something that is not a socket.
Mon Jun 20 11:08:10 [initandlisten] select() failure: ret=-1 errno:10038 An oper
ation was attempted on something that is not a socket.
Mon Jun 20 11:08:10 [initandlisten] now exiting
Mon Jun 20 11:08:10 dbexit:
Mon Jun 20 11:08:10 [initandlisten] shutdown: going to close listening sockets..
.
Mon Jun 20 11:08:10 [initandlisten] shutdown: going to flush diaglog...
Mon Jun 20 11:08:10 shutdown: going to flush diaglog...
Mon Jun 20 11:08:10 shutdown: going to close sockets...
Mon Jun 20 11:08:10 [initandlisten] shutdown: going to close sockets...
Mon Jun 20 11:08:10 shutdown: waiting for fs preallocator...
Mon Jun 20 11:08:10 shutdown: closing all files...
Mon Jun 20 11:08:10 closeAllFiles() finished
Mon Jun 20 11:08:10 [initandlisten] shutdown: waiting for fs preallocator...
Mon Jun 20 11:08:10 [initandlisten] shutdown: closing all files...
Mon Jun 20 11:08:10 closeAllFiles() finished
Mon Jun 20 11:08:10 shutdown: removing fs lock...
Mon Jun 20 11:08:10 [initandlisten] shutdown: removing fs lock...
Mon Jun 20 11:08:10 [initandlisten] couldn't remove fs lock 0
Mon Jun 20 11:08:10 dbexit: really exiting now
Mon Jun 20 11:13:42 [initandlisten] MongoDB starting : pid=1624 port=27017 dbpat
h=c:\data\db2 64-bit
Mon Jun 20 11:13:42 [initandlisten] db version v1.8.2, pdfile version 4.5
Mon Jun 20 11:13:42 [initandlisten] git version: 433bbaa14aaba6860da15bd4de8edf6
00f56501b
Mon Jun 20 11:13:42 [initandlisten] build sys info: windows (6, 1, 7600, 2, '')
BOOST_LIB_VERSION=1_42
Mon Jun 20 11:13:42 [initandlisten] waiting for connections on port 27017
Mon Jun 20 11:13:42 [websvr] web admin interface listening on port 28017

C:\data\db2>
=====end of log file=====

Comment by Michael Catanzariti [ 20/Jun/11 ]

Hi Robert,

Is there any progress about this bug ?

Comment by Robert Stam [ 10/May/11 ]

Thanks for the log.

That's definitely different than my log. After CTRL_SHUTDOWN_EVENT I have another 20 or lines culminating with a clean exit. We'll have to figure out what's different about your environment than mine. One difference is that I'm using Windows 7 Ultimate instead of Windows 7 Enterprise, but that shouldn't make any difference.

Comment by Michael Catanzariti [ 09/May/11 ]

I just installed a fresh version of mongo with no data, started it and shutowned windows 7. So there is no load & no data at all.

This is my hardware config :
Intel Xeon W3530 2.8GHz (8 cores)
12.0 GB

This is the "Path to executable" windows service setting : "C:\Program Files\MongoDB\bin\mongod" --dbpath "D:\mongoDB\data" --logpath "D:\mongoDB\logs\log.txt" --logappend --service

This is the content of my log.txt file

=== BEGINING OF FILE
Mon May 09 15:27:02 [initandlisten] MongoDB starting : pid=4196 port=27017 dbpath=D:\mongoDB\data 64-bit
Mon May 09 15:27:02 [initandlisten] db version v1.8.1, pdfile version 4.5
Mon May 09 15:27:02 [initandlisten] git version: a429cd4f535b2499cc4130b06ff7c26f41c00f04
Mon May 09 15:27:02 [initandlisten] build sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42
Mon May 09 15:27:02 [initandlisten] waiting for connections on port 27017
Mon May 09 15:27:02 [websvr] web admin interface listening on port 28017
Mon May 09 15:27:25 CTRL_SHUTDOWN_EVENT signal (ignored)
Mon May 09 15:29:56 [initandlisten] MongoDB starting : pid=1824 port=27017 dbpath=D:\mongoDB\data 64-bit
Mon May 09 15:29:56 [initandlisten] db version v1.8.1, pdfile version 4.5
Mon May 09 15:29:56 [initandlisten] git version: a429cd4f535b2499cc4130b06ff7c26f41c00f04
Mon May 09 15:29:56 [initandlisten] build sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42
Mon May 09 15:29:56 [initandlisten] exception in initAndListen std::exception: old lock file, terminating
Mon May 09 15:29:56 dbexit:
Mon May 09 15:29:56 [initandlisten] shutdown: going to close listening sockets...
Mon May 09 15:29:56 [initandlisten] shutdown: going to flush diaglog...
Mon May 09 15:29:56 [initandlisten] shutdown: going to close sockets...
Mon May 09 15:29:56 [initandlisten] shutdown: waiting for fs preallocator...
Mon May 09 15:29:56 [initandlisten] shutdown: closing all files...
Mon May 09 15:29:56 closeAllFiles() finished
Mon May 09 15:29:56 dbexit: really exiting now
=== END OF FILE

Comment by Robert Stam [ 09/May/11 ]

Reopening because user reports they can reliably reproduce the issue. Am waiting for log files.

Comment by Robert Stam [ 09/May/11 ]

Something must be different on our systems then. Do you have any log files we could look at? One problem with the infinite restarts when booting is that it creates very large log files, so the first step I would recommend is manually configuring the MongoDB service to only restart on failure the first or second time. And then perhaps start with a new empty log file and reproduce.

Windows only allows a process about 30 seconds to shutdown, so it's possible that something is preventing mongod from shutting down within 30 seconds and is then being killed by Windows for taking too long. Is your mongod under heavy load when you reboot the system? Are there any long running operations that take more than 30 seconds?

Comment by Michael Catanzariti [ 09/May/11 ]

Sorry, but on several workstations of my company, we use mongodb 1.8.1 on Windows 7 Enterprise 64 bit.
And after each reboot of the machine, MongoDB is not shutting down cleanly and leave behind a mongod.lock file.
After the mongod.lock file deletion, the service starts with no problem.
The problem is reproduced systematically

Comment by Robert Stam [ 09/May/11 ]

Works as designed. See previous comment.

Comment by Robert Stam [ 09/May/11 ]

I just retested this using MongoDB 1.8.1 on a Windows 7 64-bit system, and it worked as designed.

Just to clarify, this JIRA ticket had to do with MongoDB not shutting down cleanly when Windows was shutdown. The result of a Windows shutdown prior to this fix was that a mongod.lock file was left behind. The presence of a mongod.lock file is evidence that mongod did not shut down cleanly and that a repair should be attempted before restarting mongod.

I mongod crashes for any other reason it will also leave a mongod.lock file behind, but that is unrelated to this JIRA ticket.

Deleting a mongod.lock file without running repair is tempting fate a bit.

I have created a new JIRA ticket regarding Windows service startup options related to mongod failing to start and how many attempts Windows will automatically make to restart the service (as opposed to this JIRA ticket which is related to a clean shutdown). See:

https://jira.mongodb.org/browse/SERVER-3063

Comment by Frank Henard [ 09/May/11 ]

Likewise, as you see in my comment above. I am using 1.8.1 with Ubuntu Server 11.04, and mongo is not stopping at shutdown.

Comment by Michael Catanzariti [ 09/May/11 ]

The fix does not seem to work on Windows 7, with the 1.8.1 release.
A mongod.lock still exists on my disk after reboot and prevents the windows service to start.
Again, removing the mongod.lock file resolves the problem.

Comment by Robert Stam [ 05/May/11 ]

The fix for this didn't make in to the 1.8.0 release. It is in the 1.8.1 release.

Comment by Robert Stam [ 05/May/11 ]

Fixed in 1.8.1 (not 1.8.0).

Comment by Frank Henard [ 05/May/11 ]

I'm also having this problem - Ubuntu Server 11.04, Mongo 1.8.1

EDIT 6/20/11 - It turns out that I was not shutting down the server correctly. Improperly shutting down the server did not impact apache, but was a problem for mongo because it would not start on the server restart.

Generated at Thu Feb 08 03:01:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.