Core Server / SERVER-53431

Server should respond to running operations with the appropriate topologyVersion on stepdown

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.9.0, 4.4.5, 4.2.16
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Fully Compatible
    • ALL
    • v4.4, v4.2
    • Repl 2021-01-25, Repl 2021-02-08
    • 0

      1. We kill operations at the beginning of stepdown. Calling AutoGetRstlForStepUpStepDown starts the killOp thread.
      2. We start killing user operations before we disable writes on the primary and before we transition the server to SECONDARY (these are the steps that update the server description and trigger a topologyVersion bump).
      3. The error response for a killed operation is therefore appended with a topologyVersion that has not been incremented yet.

      Since the topologyVersion has not been incremented, the driver will try to reselect the same server to run the command, even though that server may still be in the process of stepping down.
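To illustrate why an unchanged topologyVersion matters on the driver side, here is a minimal sketch of a staleness check in the spirit of the drivers' server discovery and monitoring rules. The names `TopologyVersion` and `is_stale_error` are hypothetical and chosen for this example only; this is not any particular driver's implementation. The assumption is that an error is treated as stale when it comes from the same server process and its counter is not greater than the one the driver already holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopologyVersion:
    # Hypothetical stand-in for the {processId, counter} pair in a server response.
    process_id: str
    counter: int


def is_stale_error(current: Optional[TopologyVersion],
                   error_tv: Optional[TopologyVersion]) -> bool:
    """Return True if the error carries no newer topology information."""
    # With a missing version on either side, assume the error is newer.
    if current is None or error_tv is None:
        return False
    # A different process id means the server restarted; assume newer.
    if current.process_id != error_tv.process_id:
        return False
    # Same process: stale unless the counter has actually advanced.
    return error_tv.counter <= current.counter


known = TopologyVersion(process_id="p1", counter=5)

# Killed-op error sent before any bump: counter unchanged, so treated as stale.
unbumped = is_stale_error(known, TopologyVersion("p1", 5))

# Same error sent after a bump: counter advanced, so the driver acts on it.
bumped = is_stale_error(known, TopologyVersion("p1", 6))
```

Under these assumptions, `unbumped` is true and `bumped` is false: only the second error gives the driver a reason to stop treating the server as a usable primary.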

      We can consider adding an extra increment to the topologyVersion before scheduling the killOps (we already increment the topologyVersion twice as part of stepdown: once when we disable writes, and again when we complete the transition to secondary). An alternative is to delay the killOps logic until the topologyVersion has been properly incremented.
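The effect of the first proposal can be sketched with a toy model. `ToyServer` and its methods are hypothetical and are not server code; the real killOp logic runs in a separate thread, whereas this sketch runs the steps sequentially purely to show which counter value a killed operation's error response would carry under each ordering.

```python
class ToyServer:
    """Toy model of stepdown ordering (hypothetical, not MongoDB server code)."""

    def __init__(self) -> None:
        self.topology_counter = 0
        # Counter value attached to each killed operation's error response.
        self.killed_op_counters: list[int] = []

    def bump_topology_version(self) -> None:
        self.topology_counter += 1

    def kill_user_operations(self) -> None:
        # The error response is stamped with whatever the counter is right now.
        self.killed_op_counters.append(self.topology_counter)

    def step_down(self, bump_before_kill: bool) -> None:
        if bump_before_kill:
            # Proposed extra increment before scheduling the killOps.
            self.bump_topology_version()
        self.kill_user_operations()
        self.bump_topology_version()  # writes disabled on the primary
        self.bump_topology_version()  # transition to SECONDARY completed


current = ToyServer()
current.step_down(bump_before_kill=False)

proposed = ToyServer()
proposed.step_down(bump_before_kill=True)
```

In this model the killed operation under the current ordering is stamped with the pre-stepdown counter, while under the proposed ordering it is stamped with an already incremented one. The second proposal (delaying the killOps) would amount to calling `kill_user_operations` after the existing bumps instead.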

            matthew.russotto@mongodb.com Matthew Russotto
            jason.chan@mongodb.com Jason Chan