  1. Core Server
  2. SERVER-81115

ReplicaSetAwareService Can Be Shutdown While Node is Still Primary

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Replication
    • Replication
    • Repl 2023-10-02, Repl 2023-10-16, Repl 2023-10-30

      On stepdown, ReplicaSetAwareServices will be notified of the stepdown as part of stepdown actions only after the node is no longer writable.

      On shutdown, in most cases this property will hold, as prior to shutting down the replication coordinator (and therefore shutting down ReplicaSetAwareServices), we will first trigger a stepdown.

      However, in rare cases the node can remain primary even after ReplicaSetAwareServices are shut down, because the stepdown attempt during shutdown can fail (for example, when no secondaries are caught up at the time of the shutdown). The stepdown can fail because, despite what the name of the forceShutdown parameter might suggest, the stepdown attempt is not forced. Instead, forceShutdown only determines whether we return the actual error when the stepdown attempt fails or swallow it and return OK anyway. Note also that the caller invariants that stepDownForShutdown returns OK, so we take no meaningful action when the stepdown attempt fails (e.g. deciding to abort the shutdown).
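
      To make the failure mode concrete, here is a minimal, self-contained model of the behavior described above. It is a sketch and not the server code: the Status and Node structs, tryStepDown, shutdownNode, and the signature given to stepDownForShutdown are simplified stand-ins invented for illustration, and a plain assert stands in for the caller's invariant.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a status type; not the server's Status class.
struct Status {
    bool ok;
    std::string reason;
};

// Hypothetical model of a node; all field names are illustrative.
struct Node {
    bool isPrimary = true;
    bool secondaryCaughtUp = false;
    bool servicesShutDown = false;
};

// Models a non-forced stepdown attempt: it succeeds only when some
// secondary is caught up (assumed failure condition, taken from the ticket).
Status tryStepDown(Node& node) {
    if (!node.secondaryCaughtUp) {
        return {false, "no caught-up secondary"};
    }
    node.isPrimary = false;
    return {true, ""};
}

// Models the described semantics: forceShutdown does not force the
// stepdown, it only decides whether a failure is reported or swallowed.
Status stepDownForShutdown(Node& node, bool forceShutdown) {
    Status s = tryStepDown(node);
    if (!s.ok && forceShutdown) {
        return {true, ""};  // error swallowed; the node is still primary
    }
    return s;
}

// Models the shutdown path: the caller only checks for OK, then shuts
// down the services regardless of whether the node actually stepped down.
void shutdownNode(Node& node, bool forceShutdown) {
    Status s = stepDownForShutdown(node, forceShutdown);
    assert(s.ok);  // stands in for the caller's invariant
    node.servicesShutDown = true;
}
```

      In this model, calling shutdownNode(node, true) on a node with no caught-up secondary leaves isPrimary and servicesShutDown both true, which is exactly the state this ticket is concerned with.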

      The ultimate consequence of this is that PrimaryOnlyService (built on top of ReplicaSetAwareService) needed to expose its shutdown state (see SERVER-78108) in order to explicitly disambiguate "the service does not exist because it has no state document on disk" from "the service does not exist because the node is shutting down." Previously, it was thought that these cases could be disambiguated by performing a no-op write after attempting to acquire a handle to the service, on the assumption that ReplicaSetAwareServices would not have been shut down as long as the node was still a writable primary. The violation of this assumption eventually led to BF-29013 (see this comment for specific details) and SERVER-78009.
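
      The difference between the two disambiguation strategies can be sketched as follows. This is an illustrative model only: LookupResult, ServiceState, classifyWithNoopWrite, and classifyWithShutdownState are hypothetical names, and the no-op write is reduced to a boolean check of whether the node is still a writable primary.

```cpp
// Hypothetical outcomes of looking up a service instance.
enum class LookupResult { kFound, kNoStateDocument, kUnavailable };

// Hypothetical, simplified view of the service and node state.
struct ServiceState {
    bool shutDown = false;         // the service has been shut down
    bool hasInstance = false;      // a state document (and instance) exists
    bool writablePrimary = true;   // the node still accepts writes
};

// Old approach: look up the instance, then perform a no-op write. If the
// write succeeds, assume the service cannot have been shut down.
LookupResult classifyWithNoopWrite(const ServiceState& s) {
    if (s.hasInstance && !s.shutDown) {
        return LookupResult::kFound;
    }
    // The no-op write succeeds exactly when the node is a writable primary.
    if (s.writablePrimary) {
        return LookupResult::kNoStateDocument;  // wrong when s.shutDown is true
    }
    return LookupResult::kUnavailable;
}

// Newer approach: consult the explicitly exposed shutdown state first.
LookupResult classifyWithShutdownState(const ServiceState& s) {
    if (s.shutDown) {
        return LookupResult::kUnavailable;
    }
    return s.hasInstance ? LookupResult::kFound
                         : LookupResult::kNoStateDocument;
}
```

      When the service has been shut down while the node is still a writable primary, the no-op-write strategy reports that no state document exists, whereas the strategy that checks the shutdown state reports the service as unavailable.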

      This ticket exists to answer a few questions:

      1. Is it possible and desirable to formalize, as a guarantee, the assumed property that a ReplicaSetAwareService will not have its stepdown or shutdown hooks called until the node is no longer a writable primary?
      2. Can (1) be accomplished trivially by forcing the stepdown being performed here (at least in the case that forceShutdown is true)?
      3. If the answer to (2) is no, is there some other reasonable way of maintaining this property as a guarantee?

            lingzhi.deng@mongodb.com Lingzhi Deng
            brett.nawrocki@mongodb.com Brett Nawrocki