[SERVER-27817] mongod does not respond after closing and starting windows service versions 3.4.0,3.4.1 Created: 26/Jan/17  Updated: 24/Aug/17  Resolved: 17/Jul/17

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: 3.4.0, 3.4.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Itzhak Kagan Assignee: Mark Benvenuto
Resolution: Cannot Reproduce Votes: 2
Labels: windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File MongoShellFailedConnection.json     File MongoShellFailedConnection.pcapng     Text File mydblog.log     Text File mydblog.log     Text File mydblog.log     Text File mydblog.log     File mydblog.log.2017-03-12T12-16-34     File mydblog.log.2017-03-12T12-29-17     File mydblog.log.2017-03-14T10-03-20     File mydblog.log.2017-03-14T10-21-57     File mydblog.log.2017-03-14T11-36-51     File mydblog.log.2017-03-14T11-40-08    
Operating System: ALL
Steps To Reproduce:

Start a 3.4.1 mongod Windows service.
Stop it. You will get an error as posted in SERVER-27782.
Restart the service.
Try to connect with the mongo shell.
After 5 seconds you will get the message:
2017-01-26T08:45:14.613+0200 W NETWORK [main] Failed to connect to xx.xx.xx.xxx:<port number> after 5000ms milliseconds, giving up.
2017-01-26T08:45:14.620+0200 E QUERY [main] Error: couldn't connect to server <machine name>:<port number>, connection attempt failed :
connect@src/mongo/shell/mongo.js:234:13
@(connect):1:6
exception: connect failed

To connect successfully, you have to stop the service and start it again.
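
The connection check in the steps above can be approximated without the mongo shell. The following is a minimal sketch, assuming Python 3 is available on the client machine. `can_connect` is a hypothetical helper name, and its 5-second default only mirrors the 5000 ms timeout shown in the shell output. It tests TCP reachability only, not whether mongod actually answers queries.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout_s.

    Hypothetical helper: it only checks that something accepts the TCP
    connection, which is weaker than a successful mongo shell login.
    """
    try:
        # create_connection resolves the host name and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If `can_connect` returns False for the machine name and port of the service while Windows reports the service as running, that would match the behavior described in this ticket.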

Participants:

 Description   

On versions 3.4.0 and 3.4.1, after stopping a mongod Windows service and starting it again, the mongod process does not respond.
This also happens after a system shutdown.
I don't know whether it relates to SERVER-6065 and/or SERVER-27782, but it's a real problem.

Needless to say, this situation prevents us from deploying our 3.4 solution.



 Comments   
Comment by Ramon Fernandez Marina [ 24/Aug/17 ]

Author: Jonathan Abrahams (hptabster) <jonathan@mongodb.com>
Message: SERVER-27817 Remove extraneous messages from hang_analyzer.py
Branch: master
https://github.com/mongodb/mongo/commit/b2ccf54b278299455e80206e9a6836ac90e5a670

Comment by Ramon Fernandez Marina [ 17/Jul/17 ]

itzikkg, unfortunately we have not been able to reproduce this issue, so I'm going to close this ticket.

I'd recommend upgrading to the latest 3.4 (3.4.6 at the time of this writing) and if you're still experiencing issues open a new ticket.

Thanks,
Ramón.

Comment by Mark Benvenuto [ 19/Jun/17 ]

I have not been able to reproduce this. I used the 3.4.2 SSL version on Windows 10 (16053.413) and Windows 2012 (6.2 Build 9200). After rebooting the machine, I can connect from a separate machine to mongod on the Windows machine. I set up the firewall to allow connections through to TCP port 27017.

Comment by Itzhak Kagan [ 26/Mar/17 ]

I checked the nightly build version v3.4.3-rc2. From my checks, the same behavior happens in 3.2.12.
When I install a mongod Windows service that uses the machine name as the bindIp and restart the server (machine), then, on some networks, the Windows (mongod) service seems to be missing something.
When I set the service “Startup type” to “Automatic”, the service appears to be up and running, but it is impossible to connect to it unless I restart the service manually.
If I set the service “Startup type” to “Automatic (Delayed Start)”, the service loads correctly and it is possible to connect to it. But of course that is not a solution, because it increases the resulting downtime.
In that respect there is no difference between 3.2.12 and 3.4.3-rc2.
I checked the network communication with Wireshark. The capture is attached as a native Wireshark file and as a JSON version of it (MongoShellFailedConnection.pcapng, MongoShellFailedConnection.json).
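
Because the service binds to the machine name, it may be worth checking which IPv4 addresses that name resolves to on the affected machine. The sketch below is only a diagnostic aid under that assumption, not a confirmed explanation of the failure. `resolve_ipv4` is a hypothetical helper, and the sketch assumes Python 3 is installed.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to.

    Hypothetical helper: returns an empty list when the name cannot be resolved.
    """
    try:
        infos = socket.getaddrinfo(
            hostname, None, socket.AF_INET, socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

Comparing the result for the machine name with the address the mongo shell prints in its "Failed to connect to ..." message shows whether the client and the server machine agree on the address.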

Thanks,
Itzhak

Comment by Itzhak Kagan [ 14/Mar/17 ]

I took the nightly build version v3.4.3-rc1.
I succeeded in upgrading the databases from 3.2.11 to that version.
I stopped the mongod service and started it a couple of times and all was good. Then I shut down my machine and started it again. At that point I tried to run mongo.exe and it failed (the process was up). The message is:
D:\MongoDb\bin_32to34\mongo.exe --host <machine name> --port 47017 -u <user> -p <password> --authenticationDatabase admin
MongoDB shell version v3.4.3-rc1
connecting to: mongodb://<machine name>:47017/
2017-03-14T13:42:25.171+0200 W NETWORK [thread1] Failed to connect to 10.36.33.232:47017 after 5000ms milliseconds, giving up.
2017-03-14T13:42:25.178+0200 E QUERY [thread1] Error: couldn't connect to server <machine name>:47017, connection attempt failed :
connect@src/mongo/shell/mongo.js:237:13
@(connect):1:6
exception: connect failed
I attached five log files from date 2017/03/14 for you to see.
Thanks,
Itzhak

Comment by Itzhak Kagan [ 12/Mar/17 ]

I took the nightly build version v3.4.3-rc0-3-g6ce1c2d.

I succeeded in upgrading the databases from 3.2.11 to that version.

I stopped the mongod service and started it a couple of times and all was good. Then I shut down my machine and started it again. At that point I tried to run mongo.exe and it failed (the process was up). The message is:

D:\MongoDb\bin_32to34\mongo.exe --host <machine name> --port 47017 -u <user> -p <password> --authenticationDatabase admin
MongoDB shell version v3.4.3-rc0-3-g6ce1c2d
connecting to: mongodb://<machine name>:47017/
2017-03-12T18:57:29.996+0200 W NETWORK [thread1] Failed to connect to 10.36.33.232:47017 after 5000ms milliseconds, giving up.
2017-03-12T18:57:29.996+0200 E QUERY [thread1] Error: couldn't connect to server <machine name>:47017, connection attempt failed :
connect@src/mongo/shell/mongo.js:237:13
@(connect):1:6
exception: connect failed

I attached three log files for you to see.

Thanks,
Itzhak

Comment by Kelsey Schubert [ 08/Mar/17 ]

Hi itzikkg,

SERVER-6065 will be corrected in MongoDB 3.4.3. If you are in a position to test whether this fix also resolves the behavior you're observing, would you please download a 3.4 nightly build? If not, would you please wait for the release of MongoDB 3.4.3 and let us know if it resolves the issue?

Thank you,
Thomas

Comment by Itzhak Kagan [ 05/Feb/17 ]

I was too optimistic in my previous comment.
The bug still occurs after a system restart.

Remember that you should upgrade a 3.2 server to a 3.4 server with the command:
db.adminCommand({ setFeatureCompatibilityVersion: "3.4" })

Please check this scenario.

Thanks,
Itzhak

Comment by Itzhak Kagan [ 05/Feb/17 ]

In version 3.4.2 this no longer occurs.
I hope the fix was intentional, which would mean that the bug will not reappear in upcoming versions.

Thanks,
Itzhak

Comment by Itzhak Kagan [ 01/Feb/17 ]

The server was a 3.2 server and was upgraded to 3.4 with the command:
db.adminCommand({ setFeatureCompatibilityVersion: "3.4" })

The log of the mongod process is attached.

Comment by Kelsey Schubert [ 30/Jan/17 ]

Hi itzikkg,

I haven't been able to reproduce the behavior you describe. From the command prompt, I executed the following:

Microsoft Windows [Version 6.2.9200]
(c) 2012 Microsoft Corporation. All rights reserved.
 
C:\Windows\system32>net stop MongoDB
The MongoDB service is stopping.
A system error has occurred.
 
System error 1067 has occurred.
 
The process terminated unexpectedly.
 
The MongoDB service was stopped successfully.
 
C:\Windows\system32>net start MongoDB
The MongoDB service is starting..
The MongoDB service was started successfully.
 
C:\Windows\system32>"C:\Program Files\MongoDB\Server\3.4\bin\mongo.exe
MongoDB shell version v3.4.1
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.1
Server has startup warnings:
2017-01-30T12:02:10.530-0500 I CONTROL  [initandlisten]
2017-01-30T12:02:10.530-0500 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-01-30T12:02:10.530-0500 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-01-30T12:02:10.530-0500 I CONTROL  [initandlisten]
> db.version()
3.4.1
>

So we can continue to investigate, would you please provide the mongod logs as well as a step by step guide to reproduce this issue?

Thank you for your help,
Thomas

Comment by Itzhak Kagan [ 26/Jan/17 ]

The workaround might be good enough temporarily for a development environment. Would you take that workaround to production? The answer is definitely NO!

Comment by Kelsey Schubert [ 26/Jan/17 ]

Hi itzikkg,

Thank you for the report – we're investigating. Since you've identified a workaround, I'm lowering the priority of this ticket.

Kind regards,
Thomas
