SERVER-20832 (Core Server)

step down command should restart heartbeats at most once

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.2.0-rc0
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: RPL A (10/09/15), Repl B (10/30/15)

      While waiting for secondaries to catch up during a step down request, the primary seems to send heartbeats to the secondaries constantly. The step down command should restart the heartbeats at most once and then let the replication coordinator reschedule new heartbeats every "heartbeatIntervalMillis" ms. This bug seems to have been introduced by SERVER-20671.
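
      The intended behavior can be illustrated with a small, self-contained simulation. This is only a sketch and not the server's actual code: simulateStepDownWait, restartHeartbeats, pollCount, and restartOnce are hypothetical names introduced here, and each loop iteration merely stands in for one check of whether the secondaries have caught up.

```javascript
// Hypothetical simulation; none of these names come from the MongoDB source.
// Returns how many times heartbeats were restarted during the catch-up wait.
function simulateStepDownWait(pollCount, restartOnce) {
    let heartbeatRestarts = 0;
    let restarted = false;

    // Stand-in for cancelling pending heartbeats and sending fresh ones.
    function restartHeartbeats() {
        heartbeatRestarts += 1;
    }

    // Each iteration models one check of whether secondaries have caught up.
    for (let i = 0; i < pollCount; i++) {
        // With the guard enabled, only the first iteration restarts heartbeats;
        // later heartbeats are left to the regular interval-based scheduling.
        if (!restartOnce || !restarted) {
            restartHeartbeats();
            restarted = true;
        }
    }
    return heartbeatRestarts;
}

// Behavior described in this ticket: one restart per check.
console.log("unguarded restarts:", simulateStepDownWait(5, false));
// Desired behavior: at most one restart for the whole wait.
console.log("guarded restarts:", simulateStepDownWait(5, true));
```

      In this sketch the unguarded loop restarts heartbeats on every check, while the guarded loop restarts them at most once, no matter how long the wait lasts.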

      ----------

      Task
      Logs

      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.703-0400 d20010| 2015-10-07T16:57:49.703-0400 I COMMAND  [conn8] command admin.$cmd command: replSetStepDown { replSetStepDown: 60.0, secondaryCatchUpPeriodSecs: 60.0 } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:150 locks:{ Global: { acquireCount: { r: 1, R: 1 } } } protocol:op_command 61264ms
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.704-0400 d20010| 2015-10-07T16:57:49.703-0400 I COMMAND  [conn7] command admin.$cmd command: isMaster { ismaster: 1.0 } ntoreturn:1 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:0 reslen:488 locks:{} protocol:op_command 1226ms
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.704-0400 d20010| 2015-10-07T16:57:49.703-0400 I REPL     [replExecDBWorker-2] transition to SECONDARY
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.704-0400 d20010| 2015-10-07T16:57:49.704-0400 I NETWORK  [conn7] end connection 127.0.0.1:61013 (6 connections now open)
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 d20010| 2015-10-07T16:57:49.704-0400 I NETWORK  [conn8] end connection 127.0.0.1:61032 (6 connections now open)
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 d20010| 2015-10-07T16:57:49.704-0400 I NETWORK  [conn11] end connection 208.52.191.216:49333 (6 connections now open)
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 d20010| 2015-10-07T16:57:49.704-0400 I NETWORK  [conn14] end connection 208.52.191.216:49367 (5 connections now open)
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 sh20277| {
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 sh20277|   "ok" : 0,
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 sh20277|   "errmsg" : "By the time we were ready to step down, we were already past the time we were supposed to step down until",
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 sh20277|   "code" : 50
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:49.705-0400 sh20277| }
      ...
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400 2015-10-07T16:57:57.682-0400 E QUERY    [thread1] Error: [0] != [0] are equal : expected replSetStepDown to close the shell's connection :
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400 doassert@src/mongo/shell/assert.js:15:14
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400 assert.neq@src/mongo/shell/assert.js:119:5
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400 @jstests/replsets/stepdown_long_wait_time.js:95:5
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400 @jstests/replsets/stepdown_long_wait_time.js:10:2
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400
      [js_test:stepdown_long_wait_time] 2015-10-07T16:57:57.685-0400 failed to load: jstests/replsets/stepdown_long_wait_time.js
      

            Assignee:
            benety.goh@mongodb.com Benety Goh
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Votes:
            0
            Watchers:
            2
