[SERVER-2719] mongod running as a service on Windows does not restart correctly after a reboot Created: 09/Mar/11 Updated: 08/Feb/23 Resolved: 15/Mar/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 1.8.0-rc1 |
| Fix Version/s: | 1.8.1, 1.9.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Robert Stam | Assignee: | Tad Marshall |
| Resolution: | Done | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Windows 7 64-bit with SP1 and all Windows Updates as of 2011-03-07. |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Operating System: | Windows | ||||||||
| Participants: | |||||||||
| Description |
|
Steps to reproduce:
1. mongod --install --logpath c:\data\logs --logappend

Upon reboot mongod repeatedly fails to start. If you watch its status using the Windows Services Administrative Tool you can see the status alternating between Starting and Started. Each time mongod attempts to start it writes a few lines to the log file, and the attempts to start are happening in such quick succession that the log file is growing at the rate of several MB/minute, so unless action is taken the disk can be filled up.

To break out of the cycle repeatedly run:
net stop MongoDB

The problem is that this command only works when the service is in the Started state, which it is only very briefly before failing. So you have to keep running this command until you get lucky.

The other piece of information is that upon reboot the mongod.lock file is not empty. It contains one line of text with a number in it.

The two log files attached are the log file right after the service was started for the first time, and the first 20K of the ever growing log file after reboot. |
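The non-empty mongod.lock symptom described above can be checked with a small script before the service is started. This is only a minimal sketch in Python. The function name and the idea of passing the lock file path in as a parameter are illustrative assumptions, not part of mongod or this ticket.

```python
import os


def lock_file_is_non_empty(lock_path):
    """Return True if the file at lock_path exists and has content.

    Per the report above, a non-empty mongod.lock after a reboot
    accompanies the restart loop. lock_path is supplied by the caller;
    this sketch does not assume where the data directory lives.
    """
    try:
        return os.path.getsize(lock_path) > 0
    except OSError:
        # A missing or unreadable file is treated as "no leftover lock".
        return False
```

A caller would pass the full path of mongod.lock inside its own data directory and inspect the result before deciding whether to start the service.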
| Comments |
| Comment by Tom Robinson [ 22/Nov/13 ] |
| Comment by Tad Marshall [ 16/Jan/12 ] |
|
@Zeng Jie – you mention version 2.0.2, but your log says 1.8.2. Some of what I see in the log file matches behavior that was fixed for 2.1.0 (see https://jira.mongodb.org/browse/SERVER-2833 ) but some fixes are in earlier versions, including the 2.0.2 you mentioned. The fix for |
| Comment by Donny V [ 25/Aug/11 ] |
|
I'm also still seeing this bug when stopping the service. It doesn't seem to happen every time, though. Log file: |
| Comment by Justin Dearing [ 26/Apr/11 ] |
|
Why does mongo not attempt to remove the lock file? Doesn't mongod hold an exclusive lock on the lock file? |
| Comment by Robert Stam [ 26/Apr/11 ] |
|
The fix that was made for this JIRA ticket is limited to ensuring that a clean shutdown of Windows results in a clean shutdown of MongoDB running as a service. An unclean shutdown of Windows or a crash of MongoDB will still leave the lock file in place resulting in this infinite loop on startup. The infinite loop can be broken either by running "net stop MongoDB" until it takes effect, or by changing the service properties for MongoDB in the Services control panel to not restart the service on failure. |
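The "run net stop MongoDB until it takes effect" workaround can be automated. The following is a minimal sketch in Python, not something shipped with MongoDB. It assumes that `net stop` exits with code 0 on success and a non-zero code otherwise, and that it is run from an elevated prompt. The function name, the attempt count, and the delay are hypothetical choices.

```python
import subprocess
import time


def stop_service_with_retry(name="MongoDB", attempts=50, delay=0.2,
                            run=subprocess.call, sleep=time.sleep):
    """Call `net stop <name>` repeatedly until it succeeds.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. `run` and `sleep` are injectable so the retry logic
    can be exercised without a Windows service being present.
    """
    for attempt in range(1, attempts + 1):
        # Assumption: exit code 0 means the service was stopped.
        if run(["net", "stop", name]) == 0:
            return attempt
        sleep(delay)
    return None
```

Calling `stop_service_with_retry()` with the defaults retries for roughly ten seconds. Because the service is only briefly in the Started state, a shorter delay gives each attempt a better chance of landing in that window.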
| Comment by Paul C [ 26/Apr/11 ] |
|
I'm still fighting the issue. My Mongo version is 1.8.1 |
| Comment by auto [ 16/Mar/11 ] |
|
Author: rstam <robert@10gen.com> Message: Fixed |
| Comment by auto [ 15/Mar/11 ] |
|
Author: rstam <robert@10gen.com> Message: Fixed |
| Comment by Testo [ 10/Mar/11 ] |
|
If you try the patch, it should work. It works for me after rebooting. |
| Comment by Eliot Horowitz (Inactive) [ 09/Mar/11 ] |
|
This does sound like it's the root cause. Can you try to fix? |
| Comment by Chris Westin [ 09/Mar/11 ] |
|
In between other things, I fooled with this a bit and looked at a bit of code. The service handler is calling db/instance.cpp/shutdownServer(). That function appears to do all the necessaries synchronously, and only after that does the service handler report success to its caller.

However, there are some discrepancies in the apparent code paths that are taken during the shutdown process. I tried shutting down in various ways, including explicitly (via net stop MongoDB), and rebooting (where the service manager should do the same thing). I've captured some log fragments (below) which show the results aren't always the same. The worst offender is the log fragment from rebooting, which seems to indicate the system just got a ctrl-c and ignored it, rather than shutting down.

More puzzling is that this is what I see, even though I can't reproduce the problem. I start up just fine. But Robert (and these external cases) don't.

---- closing down foreground mongod process with ^C:

---- net stop MongoDB
accompanied by
C:\Windows\system32>net stop MongoDB
The pipe has been ended.
C:\Windows\system32>

---- rebooting the machine

---- mongod --remove
and actually removes the lockfile, doesn't just truncate it as other things do |
| Comment by Chris Westin [ 09/Mar/11 ] |
|
There's also this thread: http://thread.gmane.org/gmane.comp.db.mongodb.user/26312 |
| Comment by Justin Dearing [ 09/Mar/11 ] |
|
Sorry, I see it. |
| Comment by Justin Dearing [ 09/Mar/11 ] |
|
Testo, did you mean to say you added a patch? If so, where can I see it? |
| Comment by Testo [ 09/Mar/11 ] |
|
add patch |