Resmoke stepdown hook should deal with NotMaster errors

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Testing Infrastructure
    • TIG 2017-10-23, TIG 2017-11-13
    • 3

      Currently, the stepdown thread's main loop does not appear to wait for a new primary to be elected before sending another replSetStepDown command. As a result, it can send a replSetStepDown command to a server that is not a primary and receive a NotMaster error in response. Here's an example of a patch build where this happens (search for "not primary"):

      https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_windows_64_2k8_ssl_jstestfuzz_concurrent_sharded_continuous_stepdown_patch_9e72a50f1ede62ab9f5899cf8f10dd93ca0c45d1_59d7e4d5e3c3312e74002b4d_17_10_06_20_18_01/0?type=T&text=true

      I think the StepDownThread should handle these NotMaster errors by ignoring them, just as it does with "connection failure" errors.
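
A rough sketch of what ignoring the error could look like. This is not the actual resmoke code: the two exception classes are local stand-ins for whatever the driver raises, and `try_step_down` and `run_step_down` are hypothetical names used only for illustration.

```python
class ConnectionFailure(Exception):
    """Stand-in for the driver's connection-failure error (hypothetical)."""


class NotMasterError(Exception):
    """Stand-in for the driver's "not primary" error (hypothetical)."""


def try_step_down(run_step_down):
    """Invoke run_step_down() and swallow the two expected failure modes.

    run_step_down is assumed to be a zero-argument callable that sends
    replSetStepDown to the node the thread currently believes is primary.

    Returns False if the node reported that it is not primary, True otherwise.
    """
    try:
        run_step_down()
    except ConnectionFailure:
        # Already ignored today: the connection can drop during a stepdown.
        return True
    except NotMasterError:
        # Proposed: no primary has been elected yet (or this node lost the
        # election), so there is nothing to step down. Skip this round.
        return False
    return True
```

Any other exception still propagates, so real failures would remain visible in the task log.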

      Another solution would be for the thread to wait until a primary is elected before stepping a node down.
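
The waiting approach could be sketched as a simple polling loop. Again, this is only an illustration under assumptions: `wait_for_primary`, `find_primary`, and the timeout values are hypothetical and not taken from resmoke.

```python
import time


def wait_for_primary(find_primary, timeout_secs=30.0, poll_interval_secs=0.1,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until find_primary() returns a node, or raise TimeoutError.

    find_primary is assumed to be a zero-argument callable that returns the
    current primary (any non-None value) or None while no primary is elected.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_secs
    while True:
        primary = find_primary()
        if primary is not None:
            return primary
        if clock() >= deadline:
            raise TimeoutError("no primary elected within %s seconds" % timeout_secs)
        sleep(poll_interval_secs)
```

The stepdown thread would call this before each replSetStepDown, so the command is only ever sent to a node that was primary at the time of the check. A node could still lose its primary status between the check and the command, so this approach narrows the window for NotMaster errors rather than closing it.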

            Assignee:
            Max Hirschhorn
            Reporter:
            Ian Boros
            Votes:
            0
            Watchers:
            5