[SERVER-25777] StopMongoProgram shouldn't implicitly switch to SIGKILL
Created: 24/Aug/16  Updated: 23/Sep/19  Resolved: 29/Sep/16

Status: Closed
Project: Core Server
Component/s: Shell, Testing Infrastructure
Affects Version/s: None
Fix Version/s: 3.2.11, 3.4.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Mathias Stearn
Assignee: Eric Milkie
Resolution: Done
Votes: 1
Labels: test-only, todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-24249 shell process handling improvements Closed
Related
related to SERVER-43513 Complete TODO listed in SERVER-25777 Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Completed:
Backport Requested: v3.0
Participants:
Linked BF Score: 0

 Description   

StopMongoProgram (which powers the shell's MongoRunner.stopMongod) switches to SIGKILL after waiting a while, and then still reports a successful shutdown. This both masks bugs that deadlock at shutdown and can cause problems for tests that actually need to wait for a full, clean shutdown. The only indication that this happened is a line in the log, which no one will read if the test is passing.

Instead, the shell will now send SIGTERM and wait forever for the mongod program to terminate. If a process is in fact deadlocked at shutdown, resmoke will now run the hang analyzer on the stuck process.
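
To illustrate the new behavior, here is a minimal Python sketch of a stop helper that sends one signal and then blocks until the child exits, with no timeout and no escalation to SIGKILL. It is not the shell's actual implementation; `stop_program` and the sleeping stand-in child are hypothetical names chosen for the example, and it assumes a POSIX system.

```python
import signal
import subprocess
import sys


def stop_program(proc: subprocess.Popen, sig: int = signal.SIGTERM) -> int:
    """Hypothetical helper: send `sig` once, then wait for the child to exit.

    There is deliberately no timeout and no fallback to SIGKILL, so a child
    that deadlocks at shutdown makes the caller hang too, which leaves the
    stuck process available for an external tool to inspect.
    """
    proc.send_signal(sig)
    return proc.wait()  # blocks indefinitely until the child exits


# Stand-in child that just sleeps; the default SIGTERM action terminates it.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_program(child)
```

On POSIX, `Popen.wait()` reports a child killed by signal N as `-N`, so the caller always observes how the process actually ended instead of an assumed success.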



 Comments   
Comment by Githook User [ 02/Nov/16 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-25777 When stopping a spawned process, MongoDB shell will now abort after implicitly falling back to SIGKILL after timeout.

(cherry picked from commit 0b879b315473bb2d0a296a782a1dc8cf7fac8e20)
Branch: v3.2
https://github.com/mongodb/mongo/commit/7a7d6144bb18878895087d775ffefd5cfdbb0d65

Comment by Eric Milkie [ 29/Sep/16 ]

The commit message for this was incorrect; the shell will now hang and not implicitly fall back to SIGKILL when stopping a spawned process. There is no timeout.

Comment by Githook User [ 29/Sep/16 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-25777 When stopping a spawned process, MongoDB shell will now abort after implicitly falling back to SIGKILL after timeout.
Branch: master
https://github.com/mongodb/mongo/commit/0b879b315473bb2d0a296a782a1dc8cf7fac8e20

Comment by Eric Milkie [ 14/Sep/16 ]

Indeed. If we have any such tests, we can change their mongod shutdowns to explicitly use SIGKILL.
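
As a rough Python sketch of what "explicitly use SIGKILL" means for such a test: the caller picks SIGKILL itself, rather than relying on a hidden fallback, and still waits for and checks the exit status. The child below, which ignores SIGTERM, is only a hypothetical stand-in for a server that cannot shut down cleanly; the exact shell-side call for a mongod is not stated in this ticket. POSIX is assumed.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a server that will never exit on SIGTERM.
STUCK_CHILD = (
    "import signal, time; "
    "signal.signal(signal.SIGTERM, signal.SIG_IGN); "
    "time.sleep(60)"
)

stuck = subprocess.Popen([sys.executable, "-c", STUCK_CHILD])

# The test states its intent: this process is killed, not shut down cleanly.
stuck.send_signal(signal.SIGKILL)
kill_code = stuck.wait()  # still wait, so the exit status is observed
```

Because the signal choice is visible in the test itself, a reader can tell that a clean shutdown was never expected here.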

Comment by Max Hirschhorn [ 14/Sep/16 ]

milkie, I wonder if we'll find any tests that left the server fsyncLock()'d and were relying on this behavior to shut down the server (SERVER-17589).

Comment by Eric Milkie [ 14/Sep/16 ]

This used to be necessary on Windows before we created the special channel to shut down processes without sending the database a shutdown command over the wire. I believe that today it is no longer necessary to have the shell terminate processes without observing their exit statuses.

Comment by Eric Milkie [ 08/Sep/16 ]

+1 for this.
