Core Server / SERVER-45686

Increase topologyVersion and respond to waiting isMasters on mock State Change Errors from the failCommand failpoint

    • Type: Task
    • Resolution: Gone away
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Repl 2020-02-10

      When the server responds with a State Change Error from the failCommand failpoint, it should also increase topologyVersion and respond to waiting isMasters. The Drivers team uses failCommand extensively in spec tests for retryable writes and reads. Without this change, the client takes ~10 seconds (maxAwaitTimeMS) to rediscover the server's state.

      For example:

      1. client configures a failCommand with NotMaster
      2. client runs a retryable write against Primary P
      3. client observes a NotMaster error and sets P to Unknown
      4. client runs the retry attempt which blocks until P is rediscovered
      5. P's Monitor is blocked for 10 seconds waiting for an awaitable isMaster response
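
      Step 1 above can be sketched as a command document. This is a minimal illustration, not taken from the ticket: the helper name `fail_command_doc` and its parameters are hypothetical, and it assumes 10107 is the server's error code for NotMaster.

```python
# Assumption: 10107 is the server error code for NotMaster.
NOT_MASTER = 10107


def fail_command_doc(commands, error_code=NOT_MASTER, times=1):
    """Build a configureFailPoint document for the failCommand failpoint.

    The next `times` commands whose names appear in `commands` should fail
    with `error_code`. The helper itself is hypothetical; only the document
    shape is meant to mirror the failpoint's configuration.
    """
    return {
        "configureFailPoint": "failCommand",
        "mode": {"times": times},
        "data": {
            "failCommands": list(commands),
            "errorCode": error_code,
        },
    }


# Make the next insert fail once with NotMaster.
doc = fail_command_doc(["insert"])
```

      With PyMongo, a document like this would typically be sent with `client.admin.command(doc)` against a server that has test commands enabled.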

      After this change, the 10-second hang should be removed:

      1. client configures a failCommand with NotMaster
      2. client runs a retryable write against Primary P
      3. client observes a NotMaster error and sets P to Unknown
      4. client runs the retry attempt which blocks until P is rediscovered
      5. P's Monitor immediately receives an awaitable isMaster response and sets P to Primary
      6. client retry attempt succeeds ASAP
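
      The difference between the two sequences can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the server's implementation: `MockServer`, `await_is_master`, `fail_with_state_change_error`, and `monitor_wait_time` are hypothetical names, and the await timeout is shortened to 0.5 seconds in place of the 10-second maxAwaitTimeMS so the example runs quickly.

```python
import threading
import time


class MockServer:
    """Toy stand-in for a server that tracks a topologyVersion counter."""

    def __init__(self):
        self._cond = threading.Condition()
        self.topology_version = 0

    def await_is_master(self, known_version, max_await_s):
        """Block until topologyVersion exceeds known_version or the timeout elapses."""
        with self._cond:
            self._cond.wait_for(
                lambda: self.topology_version > known_version,
                timeout=max_await_s,
            )
            return self.topology_version

    def fail_with_state_change_error(self, bump_topology_version):
        """Simulate failCommand returning NotMaster.

        When bump_topology_version is True, also increase topologyVersion
        and wake every waiting isMaster (the behavior this ticket requests).
        """
        if bump_topology_version:
            with self._cond:
                self.topology_version += 1
                self._cond.notify_all()


def monitor_wait_time(bump, max_await_s=0.5):
    """Return how long a monitor thread stays blocked in an awaitable isMaster."""
    server = MockServer()
    result = {}

    def monitor():
        start = time.monotonic()
        server.await_is_master(known_version=0, max_await_s=max_await_s)
        result["elapsed"] = time.monotonic() - start

    thread = threading.Thread(target=monitor)
    thread.start()
    time.sleep(0.05)  # let the monitor start waiting before the error fires
    server.fail_with_state_change_error(bump)
    thread.join()
    return result["elapsed"]
```

      Calling `monitor_wait_time(False)` models the current behavior, where the monitor waits out the full timeout, while `monitor_wait_time(True)` models the proposed behavior, where the monitor is released shortly after the mock error.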

            Assignee:
            jason.chan@mongodb.com Jason Chan
            Reporter:
            shane.harvey@mongodb.com Shane Harvey
            Votes:
            0
            Watchers:
            6

              Created:
              Updated:
              Resolved: