[SERVER-9588] graceful shutdown Created: 06/May/13  Updated: 04/Sep/14  Resolved: 04/Sep/14

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major - P3
Reporter: Vincent Sevel Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-9589 semi-automatic new primary election /... Closed
Participants:

 Description   

the lack of graceful shutdown makes it very easy today to lose data over a slow replication link.

take the use case where you write with write concern JOURNALED, insert several thousand documents, and shut down the primary node. in a configuration with another node and an arbiter, the second node will become primary in a matter of seconds and will start serving reading and writing clients.
by the time the old primary restarts and generates a rollback file, the new primary will have accepted numerous writes, some of which cannot be merged with the rollback file (not even manually). not to mention that while the old primary is down, reading clients will see state that is older than the state that was previously accepted.

in the context of mongo, a graceful shutdown should

  • disallow writes
  • wait for at least one other node to be up to date
  • step down
  • shut down
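
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` interface, its method names, and `FakeNode` are hypothetical and are not part of any MongoDB driver. In a real deployment each method would have to be mapped onto server commands, and the timeout and poll interval are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

public class GracefulShutdown {

    /** Hypothetical view of the node being stopped; NOT a MongoDB driver API. */
    interface Node {
        void disallowWrites();          // stop accepting client writes
        boolean hasCaughtUpSecondary(); // true once another node holds every write
        void stepDown();                // give up the primary role
        void shutdown();                // stop the server process
    }

    /**
     * Runs the steps in order. If no other node catches up before the
     * timeout, it throws and the node is neither stepped down nor stopped.
     */
    static void run(Node node, long timeoutMillis, long pollMillis) {
        node.disallowWrites();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!node.hasCaughtUpSecondary()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IllegalStateException(
                        "no other node caught up within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        node.stepDown();
        node.shutdown();
    }

    /** In-memory stand-in, used only to exercise the sequence. */
    static class FakeNode implements Node {
        final List<String> calls = new ArrayList<>();
        private int pollsUntilCaughtUp;

        FakeNode(int pollsUntilCaughtUp) { this.pollsUntilCaughtUp = pollsUntilCaughtUp; }

        public void disallowWrites() { calls.add("disallowWrites"); }
        public boolean hasCaughtUpSecondary() { return pollsUntilCaughtUp-- <= 0; }
        public void stepDown() { calls.add("stepDown"); }
        public void shutdown() { calls.add("shutdown"); }
    }

    public static void main(String[] args) {
        FakeNode node = new FakeNode(3); // reports "caught up" on the fourth poll
        run(node, 5_000, 10);
        System.out.println(node.calls);
    }
}
```

Failing on timeout rather than shutting down anyway is a deliberate choice in this sketch: it leaves the decision to the operator instead of silently accepting the data loss described above.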

as an example, I wrote MongoShutdown.java, which takes care of gracefully shutting down a node in a topology with 2 nodes and an arbiter. I believe, however, that this kind of service should be provided directly by the server.



 Comments   
Comment by Vincent Sevel [ 14/May/13 ]

after the step down is successful, I have to wait until the secondary is up to date (this might take some seconds); only then can I shut down the node.
coordinating these different actions in a gracefulShutdown() command would make sense, and would allow users to simply stop the mongo service (i.e. stop the Windows service on Windows) without fear of losing data.
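
One way to decide that another node is "up to date" is to compare the time of the last applied operation on each member with that of the node being stopped. The sketch below assumes such per-member information is available (for example from replica set status output). The `Member` type and its fields are hypothetical simplifications, and comparing timestamps alone is an approximation.

```java
import java.util.Arrays;
import java.util.List;

public class CatchUpCheck {

    /** Simplified, hypothetical view of one replica-set member. */
    static final class Member {
        final String name;
        final String state;      // "PRIMARY", "SECONDARY", "ARBITER", ...
        final long lastOpMillis; // time of the last applied operation

        Member(String name, String state, long lastOpMillis) {
            this.name = name;
            this.state = state;
            this.lastOpMillis = lastOpMillis;
        }
    }

    /**
     * True if some data-bearing member other than selfName has applied
     * operations at least as recent as selfName's last one.
     * Arbiters hold no data, so they never count.
     */
    static boolean anotherMemberIsCaughtUp(String selfName, List<Member> members) {
        Member self = null;
        for (Member m : members) {
            if (m.name.equals(selfName)) {
                self = m;
            }
        }
        if (self == null) {
            throw new IllegalArgumentException("unknown member: " + selfName);
        }
        for (Member m : members) {
            boolean dataBearing = m.state.equals("PRIMARY") || m.state.equals("SECONDARY");
            if (!m.name.equals(selfName) && dataBearing && m.lastOpMillis >= self.lastOpMillis) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two data nodes and an arbiter, as in the topology described in this ticket.
        List<Member> members = Arrays.asList(
                new Member("node1", "SECONDARY", 2_000L), // the node about to be stopped
                new Member("node2", "PRIMARY", 1_500L),   // still behind node1
                new Member("arb", "ARBITER", 0L));
        System.out.println(anotherMemberIsCaughtUp("node1", members));
    }
}
```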

Comment by Eliot Horowitz (Inactive) [ 14/May/13 ]

See my comment on SERVER-9589; I think stepDown already does what you want.
