Core Server / SERVER-21062

A REMOVED node that is ahead of the other nodes in the set can prevent a primary from being elected

Details

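    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.2.0-rc0
    • Fix Version/s: 3.2.0-rc1
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL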

    Description

      Following an upgrade of mmapv1 SCCC config servers to CSRS, I sometimes (about 60% of the time) see the new replica set get stuck without a primary. This happens after the first config server is restarted without --configsvrMode=sccc and enters the REMOVED state. The remaining 3 replica set members stay in SECONDARY state.

      This is with commit dbbc9a2e3d8c4d7fe1748fa980ba7d01b9489dbe.

      rs.status():

      csrs:REMOVED> rs.status()
      {
      	"set" : "csrs",
      	"date" : ISODate("2015-10-21T21:51:22.697Z"),
      	"myState" : 10,
      	"term" : NumberLong(1),
      	"configsvr" : true,
      	"heartbeatIntervalMillis" : NumberLong(2000),
      	"members" : [
      		{
      			"_id" : 0,
      			"name" : "neurofunk.local:9007",
      			"health" : 1,
      			"state" : 10,
      			"stateStr" : "REMOVED",
      			"uptime" : 53,
      			"optime" : {
      				"ts" : Timestamp(1445464229, 1),
      				"t" : NumberLong(1)
      			},
      			"optimeDate" : ISODate("2015-10-21T21:50:29Z"),
      			"infoMessage" : "could not find member to sync from",
      			"configVersion" : 3,
      			"self" : true
      		},
      		{
      			"_id" : 1,
      			"name" : "neurofunk.local:53836",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"uptime" : 52,
      			"optime" : {
      				"ts" : Timestamp(1445464217, 1),
      				"t" : NumberLong(1)
      			},
      			"optimeDate" : ISODate("2015-10-21T21:50:17Z"),
      			"lastHeartbeat" : ISODate("2015-10-21T21:51:22.161Z"),
      			"lastHeartbeatRecv" : ISODate("2015-10-21T21:51:22.111Z"),
      			"pingMs" : NumberLong(0),
      			"configVersion" : 3
      		},
      		{
      			"_id" : 2,
      			"name" : "neurofunk.local:53835",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"uptime" : 52,
      			"optime" : {
      				"ts" : Timestamp(1445464217, 1),
      				"t" : NumberLong(1)
      			},
      			"optimeDate" : ISODate("2015-10-21T21:50:17Z"),
      			"lastHeartbeat" : ISODate("2015-10-21T21:51:22.161Z"),
      			"lastHeartbeatRecv" : ISODate("2015-10-21T21:51:22.111Z"),
      			"pingMs" : NumberLong(0),
      			"configVersion" : 3
      		},
      		{
      			"_id" : 4,
      			"name" : "neurofunk.local:53834",
      			"health" : 1,
      			"state" : 2,
      			"stateStr" : "SECONDARY",
      			"uptime" : 52,
      			"optime" : {
      				"ts" : Timestamp(1445464217, 1),
      				"t" : NumberLong(1)
      			},
      			"optimeDate" : ISODate("2015-10-21T21:50:17Z"),
      			"lastHeartbeat" : ISODate("2015-10-21T21:51:22.161Z"),
      			"lastHeartbeatRecv" : ISODate("2015-10-21T21:51:22.111Z"),
      			"pingMs" : NumberLong(0),
      			"configVersion" : 3
      		}
      	],
      	"ok" : 1,
      	"$gleStats" : {
      		"lastOpTime" : Timestamp(0, 0),
      		"electionId" : ObjectId("000000000000000000000000")
      	}
      }
      csrs:REMOVED> 
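
The condition visible in the output above (a REMOVED member whose optime is later than that of every SECONDARY) can be checked mechanically. The following is a minimal sketch, not server code: `compareOptime` and `removedMembersAhead` are hypothetical helper names, plain objects stand in for the shell's `Timestamp` and `NumberLong` types, and the ordering used (term first, then timestamp) is an assumption of this sketch rather than something taken from the server source.

```javascript
// Sketch only. An optime is modelled as
//   { ts: { t: <seconds>, i: <increment> }, t: <term> }
// instead of the shell's Timestamp/NumberLong values.

// Sort-style comparator: negative if a < b, zero if equal, positive if a > b.
function compareOptime(a, b) {
  if (a.t !== b.t) return a.t - b.t;             // term first (assumed ordering)
  if (a.ts.t !== b.ts.t) return a.ts.t - b.ts.t; // then timestamp seconds
  return a.ts.i - b.ts.i;                        // then increment
}

// Names of REMOVED members whose optime is strictly ahead of every SECONDARY.
function removedMembersAhead(status) {
  const secondaries = status.members.filter((m) => m.stateStr === "SECONDARY");
  if (secondaries.length === 0) return [];
  return status.members
    .filter((m) => m.stateStr === "REMOVED")
    .filter((r) => secondaries.every((s) => compareOptime(r.optime, s.optime) > 0))
    .map((r) => r.name);
}

// Values copied from the rs.status() output above, reduced to the needed fields.
const status = {
  members: [
    { name: "neurofunk.local:9007",  stateStr: "REMOVED",
      optime: { ts: { t: 1445464229, i: 1 }, t: 1 } },
    { name: "neurofunk.local:53836", stateStr: "SECONDARY",
      optime: { ts: { t: 1445464217, i: 1 }, t: 1 } },
    { name: "neurofunk.local:53835", stateStr: "SECONDARY",
      optime: { ts: { t: 1445464217, i: 1 }, t: 1 } },
    { name: "neurofunk.local:53834", stateStr: "SECONDARY",
      optime: { ts: { t: 1445464217, i: 1 }, t: 1 } },
  ],
};

console.log(removedMembersAhead(status));
```

For the status shown here the check reports neurofunk.local:9007, whose optime timestamp is 12 seconds later than that of the three SECONDARY members.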
      

      I will attach logs.

      Attachments

        1. SERVER-21062.tar.gz (55 kB)
        2. original-configsvr1-pre-second-restart.log (39 kB)
        3. original-configsvr1-post-second-restart.log (276 kB)
        4. new-configsvr3.log (49 kB)
        5. new-configsvr2.log (56 kB)
        6. new-configsvr1.log (51 kB)

            People

              Assignee: Scott Hernandez (scotthernandez)
              Reporter: Timothy Olsen (tim.olsen@mongodb.com)
              Votes: 0
              Watchers: 12
