[SERVER-21019] MongoS service does not restart as per service defined behaviour if all config servers are down Created: 19/Oct/15  Updated: 24/Jun/19  Resolved: 24/Jun/19

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: 3.0.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Anil Kumar Assignee: DO NOT USE - Backlog - Platform Team
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

v3.0.4, Windows 7


Issue Links:
Related
is related to SERVER-17295 On Windows, service recovery actions ... Closed
Operating System: ALL
Participants:

 Description   

In a single-replica-set environment with a MongoS, all servers (mongoD, mongoS and 3 config servers) are configured as Windows services. Every service has recovery actions that should cause Windows to restart it 10 seconds after a failure. The following command is used to specify the recovery actions:

sc failure <mongod/mongos/mongodbconfigX> reset= 0 actions= restart/10000/restart/10000/restart/10000
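
For readers unfamiliar with the actions= syntax: the value is a slash-separated list of action/delay pairs, with each delay in milliseconds, so restart/10000 means "restart after 10 seconds". The sketch below only builds the argument list for the command above and does not run it; the helper names build_failure_actions and sc_failure_command are hypothetical and exist purely for illustration.

```python
def build_failure_actions(actions):
    """Join (action, delay_ms) pairs into the slash-separated
    form expected by the actions= parameter of `sc failure`."""
    return "/".join(f"{name}/{delay_ms}" for name, delay_ms in actions)


def sc_failure_command(service, reset_seconds, actions):
    """Return the argument list for `sc failure`.

    `sc` requires a space after `reset=` and `actions=`, which is why
    each value is passed as its own argument. Nothing is executed here.
    """
    return [
        "sc", "failure", service,
        "reset=", str(reset_seconds),
        "actions=", build_failure_actions(actions),
    ]


# Three restart attempts, each 10000 ms (10 seconds) after the failure.
three_restarts = [("restart", 10000)] * 3
command = sc_failure_command("mongos", 0, three_restarts)
print(" ".join(command))
```

The printed line has the same shape as the command in the description, with the service name filled in as mongos.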

If all 3 config servers are stopped and MongoS is then started, MongoS exits (expected); however, it does not automatically restart after 10 seconds as specified by the service configuration command above.

The following is the output of sc query mongos:

SERVICE_NAME: mongos
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 1066  (0x42a)
        SERVICE_EXIT_CODE  : 5  (0x5)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0

The suspicion is that Windows does not restart MongoS because the service exited in the STOPPED state instead of PAUSED.



 Comments   
Comment by Ramon Fernandez Marina [ 14/Oct/16 ]

Hi alessandro.gherardi@yahoo.com, I'm glad to hear you're not seeing this issue in 3.2.9, although I'm not aware of any modification on our end that could have accounted for the behavior change – I'll take a closer look.

Thanks,
Ramón.

Comment by Alessandro Gherardi [ 14/Oct/16 ]

Any updates?

Comment by Alessandro Gherardi [ 23/Sep/16 ]

I switched to using config server replica sets with Mongo 3.2.9 in my development environment and this issue no longer occurs.

Can you please confirm this is by design and if so close this ticket?

Thank you in advance.

Comment by Alessandro Gherardi [ 10/Mar/16 ]

If I call:

sc failureflag MongoS 1

and the config servers are down, the mongoS service cannot be stopped via "net stop MongoS". The only way to recover is to set the failureflag back to 0, then call "net stop MongoS".

In other words, I don't think that setting failureflag to 1 is a viable solution.

Comment by Mark Benvenuto [ 20/Oct/15 ]

It depends on the following setting being 1.

>sc qfailureflag
DESCRIPTION:
        Retrieves the failure actions flag setting of a service.
        If this setting is 0 (default), the Service Control Manager
        (SCM) enables configured failure actions on the service
        only if the service process terminates with the service in
        a state other than SERVICE_STOPPED. If this setting is 1,
        the SCM enables configured failure actions on the service
        if the service enters the SERVICE_STOPPED state with a Win32
        exit code other than 0 in addition to the service process
        termination as above. This setting is ignored if the service
        does not have any failure actions configured.
USAGE:
        sc <server> qfailureflag [service name]

See https://msdn.microsoft.com/en-us/library/windows/desktop/ms685937(v=vs.85).aspx for more information.
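
The rule in the help text above can be restated as a small decision function. This is only a sketch of the documented behaviour as quoted here, not the SCM's actual implementation; the function name failure_actions_fire and its parameters are hypothetical. It assumes the service has already ended, either by reporting SERVICE_STOPPED or by its process terminating in some other state.

```python
SERVICE_STOPPED = "SERVICE_STOPPED"


def failure_actions_fire(failure_flag, has_actions, final_state, win32_exit_code):
    """Return True if the configured failure actions would be triggered,
    following the `sc qfailureflag` description quoted above."""
    # The flag is ignored when no failure actions are configured.
    if not has_actions:
        return False
    # Process terminated while the service was in any state other than
    # SERVICE_STOPPED: actions are triggered for flag 0 and flag 1 alike.
    if final_state != SERVICE_STOPPED:
        return True
    # Service reached SERVICE_STOPPED: actions are triggered only when the
    # flag is 1 and the Win32 exit code is non-zero.
    return failure_flag == 1 and win32_exit_code != 0


# Values from the `sc query mongos` output in the description:
# STATE = STOPPED, WIN32_EXIT_CODE = 1066.
print(failure_actions_fire(0, True, SERVICE_STOPPED, 1066))  # default flag
print(failure_actions_fire(1, True, SERVICE_STOPPED, 1066))  # flag set to 1
```

With the default flag of 0, the function returns False for the reported state, which matches the missing restart described in this ticket; with the flag set to 1 it returns True.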
