Core Server / SERVER-31461

Resmoke stepdown hook should deal with NotMaster errors

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Testing Infrastructure
    • Labels: None
    • Sprint: TIG 2017-10-23, TIG 2017-11-13

      Description

      Right now the stepdown thread's main loop doesn't seem to wait for a new primary to be elected before sending another replSetStepDown command. As a result, it can send a replSetStepDown command to a server that isn't the primary and receive a NotMaster error. Here's an example of a patch build where this happens (search for "not primary"):

      https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_windows_64_2k8_ssl_jstestfuzz_concurrent_sharded_continuous_stepdown_patch_9e72a50f1ede62ab9f5899cf8f10dd93ca0c45d1_59d7e4d5e3c3312e74002b4d_17_10_06_20_18_01/0?type=T&text=true

      I think the StepDownThread should deal with these NotMaster errors and ignore them, just as it does with "connection failure" errors.
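
A minimal sketch of what that handling could look like, not the actual resmoke code. `CommandError`, `try_step_down`, and `run_command` are hypothetical names for illustration; a real hook would catch the driver's own exception types. The sketch assumes the server reports NotMaster as error code 10107 (and NotMasterNoSlaveOk as 13435) and that a dropped connection surfaces as a `ConnectionError`.

```python
# Server error codes treated as "this node is not the primary".
NOT_MASTER_CODES = frozenset({10107, 13435})  # NotMaster, NotMasterNoSlaveOk


class CommandError(Exception):
    """Hypothetical stand-in for a driver's command-failure exception."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code


def try_step_down(run_command):
    """Send a stepdown via run_command() and tolerate the expected failures.

    run_command is a hypothetical zero-argument callable that issues
    replSetStepDown against the node believed to be primary.

    Returns True if the command went through or the connection was dropped
    (the case the hook already ignores), and False if the node reported
    that it is not the primary. Any other command error is re-raised.
    """
    try:
        run_command()
    except ConnectionError:
        # The primary closed the connection while stepping down; expected.
        return True
    except CommandError as err:
        if err.code in NOT_MASTER_CODES:
            # No primary has been elected yet, or this node already
            # stepped down. Ignore it and let the main loop try again.
            return False
        raise
    return True
```

Returning a flag instead of swallowing the error silently lets the caller log the skipped stepdown or shorten its next wait.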

      Another solution would be for the thread to wait until a primary is elected before stepping a node down.
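
A rough sketch of that alternative, again under assumptions: `wait_for_primary`, `nodes`, and `is_primary` are hypothetical names, and `is_primary(node)` stands for whatever check the hook would use (for example, inspecting the node's isMaster response). The timeout and poll interval are arbitrary placeholder values.

```python
import time


def wait_for_primary(nodes, is_primary, timeout_secs=30.0, poll_secs=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll nodes until one reports itself primary, then return it.

    nodes is an iterable of node handles and is_primary is a hypothetical
    callable returning True when the given node is currently the primary.
    Raises TimeoutError if no primary shows up within timeout_secs.
    """
    nodes = list(nodes)
    deadline = clock() + timeout_secs
    while True:
        for node in nodes:
            if is_primary(node):
                return node
        if clock() >= deadline:
            raise TimeoutError(
                "no primary elected within %.1f seconds" % timeout_secs)
        # Back off briefly before asking every node again.
        sleep(poll_secs)
```

The stepdown loop would call this before each replSetStepDown and send the command only to the node it returns. An election can still happen between the check and the command, so this narrows the window for NotMaster errors rather than eliminating it.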

              People

              Assignee:
              max.hirschhorn Max Hirschhorn
              Reporter:
              ian.boros Ian Boros