[SERVER-27138] Cannot run replSetStepDown for a 3.4 replica set with journaling disabled Created: 21/Nov/16  Updated: 06/Dec/22  Resolved: 28/Nov/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.4.0-rc4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Cailin Nelson Assignee: Backlog - Replication Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-27217 Allow nopreallocj to be specified in ... Closed
Related
related to DOCS-9367 summarize use of writeConcernMajority... Closed
Assigned Teams:
Replication
Operating System: ALL

 Description   
  1. Start with a 3-node replica set in which all nodes run wiredTiger with journaling disabled
  2. Make a single write
  3. Attempt to step down the primary

When attempting the step-down command, you will get:

backup_test:PRIMARY> rs.stepDown()
{
	"ok" : 0,
	"errmsg" : "No electable secondaries caught up as of 2016-11-20T21:56:10.466+0000. Please use {force: true} to force node to step down.",
	"code" : 50,
	"codeName" : "ExceededTimeLimit"
}
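
A minimal repro sketch from the shell (host names, ports, and paths here are illustrative, not the exact ones from the report):

# start three mongod nodes with journaling disabled
mongod --port 27000 --replSet backup_test --dbpath /data/rs0 --storageEngine wiredTiger --nojournal --fork --logpath /data/rs0/mongodb.log
mongod --port 27010 --replSet backup_test --dbpath /data/rs1 --storageEngine wiredTiger --nojournal --fork --logpath /data/rs1/mongodb.log
mongod --port 27020 --replSet backup_test --dbpath /data/rs2 --storageEngine wiredTiger --nojournal --fork --logpath /data/rs2/mongodb.log

// then, from a mongo shell connected to the first node:
rs.initiate({
    _id: "backup_test",
    members: [
        { _id: 0, host: "localhost:27000" },
        { _id: 1, host: "localhost:27010" },
        { _id: 2, host: "localhost:27020" }
    ]
})
db.test.insert({ x: 1 })   // a single w:1 write
rs.stepDown()              // fails with ExceededTimeLimit as shown above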

The rs.status() output is shown below. Note that the optimeDurable values are unset or stale, and lastCommittedOpTime is stuck at Timestamp(0, 0):

backup_test:PRIMARY> rs.status()
{
	"set" : "backup_test",
	"date" : ISODate("2016-11-20T21:55:49.329Z"),
	"myState" : 1,
	"term" : NumberLong(2),
	"heartbeatIntervalMillis" : NumberLong(2000),
	"optimes" : {
		"lastCommittedOpTime" : {
			"ts" : Timestamp(0, 0),
			"t" : NumberLong(-1)
		},
		"appliedOpTime" : {
			"ts" : Timestamp(1479678946, 1),
			"t" : NumberLong(2)
		},
		"durableOpTime" : {
			"ts" : Timestamp(0, 0),
			"t" : NumberLong(-1)
		}
	},
	"members" : [
		{
			"_id" : 0,
			"name" : "cailinmac:27000",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 7,
			"optime" : {
				"ts" : Timestamp(1479678946, 1),
				"t" : NumberLong(2)
			},
			"optimeDurable" : {
				"ts" : Timestamp(1479678888, 1),
				"t" : NumberLong(1)
			},
			"optimeDate" : ISODate("2016-11-20T21:55:46Z"),
			"optimeDurableDate" : ISODate("2016-11-20T21:54:48Z"),
			"lastHeartbeat" : ISODate("2016-11-20T21:55:48.020Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-20T21:55:46.518Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "cailinmac:27020",
			"configVersion" : 1
		},
		{
			"_id" : 1,
			"name" : "cailinmac:27010",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 140,
			"optime" : {
				"ts" : Timestamp(1479678946, 1),
				"t" : NumberLong(2)
			},
			"optimeDate" : ISODate("2016-11-20T21:55:46Z"),
			"infoMessage" : "could not find member to sync from",
			"electionTime" : Timestamp(1479678905, 1),
			"electionDate" : ISODate("2016-11-20T21:55:05Z"),
			"configVersion" : 1,
			"self" : true
		},
		{
			"_id" : 2,
			"name" : "cailinmac:27020",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 120,
			"optime" : {
				"ts" : Timestamp(1479678946, 1),
				"t" : NumberLong(2)
			},
			"optimeDurable" : {
				"ts" : Timestamp(0, 0),
				"t" : NumberLong(-1)
			},
			"optimeDate" : ISODate("2016-11-20T21:55:46Z"),
			"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
			"lastHeartbeat" : ISODate("2016-11-20T21:55:47.952Z"),
			"lastHeartbeatRecv" : ISODate("2016-11-20T21:55:48.912Z"),
			"pingMs" : NumberLong(0),
			"syncingTo" : "cailinmac:27010",
			"configVersion" : 1
		}
	],
	"ok" : 1
}

Here are the command-line options for a member. If I enable journaling, the problem disappears (a journaling-enabled invocation is sketched after the output).

backup_test:PRIMARY> db.runCommand({getCmdLineOpts:1})
{
	"argv" : [
		"mongod",
		"--port=27000",
		"--replSet=backup_test",
		"--dbpath=/Users/cailin/Documents/code/mms/data/db/replica/backup_test/backup_test_0",
		"--logpath=/Users/cailin/Documents/code/mms/data/db/replica/backup_test/backup_test_0/mongodb.log",
		"--logappend",
		"--oplogSize=100",
		"--storageEngine=wiredTiger",
		"--nojournal",
		"--wiredTigerEngineConfigString=cache_size=512MB"
	],
	"parsed" : {
		"net" : {
			"port" : 27000
		},
		"replication" : {
			"oplogSizeMB" : 100,
			"replSet" : "backup_test"
		},
		"storage" : {
			"dbPath" : "/Users/cailin/Documents/code/mms/data/db/replica/backup_test/backup_test_0",
			"engine" : "wiredTiger",
			"journal" : {
				"enabled" : false
			},
			"wiredTiger" : {
				"engineConfig" : {
					"configString" : "cache_size=512MB"
				}
			}
		},
		"systemLog" : {
			"destination" : "file",
			"logAppend" : true,
			"path" : "/Users/cailin/Documents/code/mms/data/db/replica/backup_test/backup_test_0/mongodb.log"
		}
	},
	"ok" : 1
}
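
For comparison, a node started the same way but with journaling left enabled (journaling is the default for wiredTiger, so it is enough to drop --nojournal; the path below is illustrative):

mongod --port=27000 --replSet=backup_test --dbpath=/data/backup_test_0 \
    --logpath=/data/backup_test_0/mongodb.log --logappend --oplogSize=100 \
    --storageEngine=wiredTiger --wiredTigerEngineConfigString=cache_size=512MB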



 Comments   
Comment by Eric Milkie [ 22/Nov/16 ]

After a discussion with kay.kim regarding DOCS-8576 and its associated commits, I filed DOCS-9367 to make it absolutely clear what is needed. Setting the flag correctly is more important now that SERVER-26747 has been implemented.

Comment by Daniel Pasette (Inactive) [ 22/Nov/16 ]

milkie, can you link to the DOCS ticket which details the requirement of using writeConcernMajorityJournalDefault: false when running without journaling or with the inMemory storage engine?

Comment by Cailin Nelson [ 21/Nov/16 ]

No. I was just using the shell and inserting a document. I presume that is w:1.

Comment by Eric Milkie [ 21/Nov/16 ]

I believe that w:majority writes would also not be working for you without that flag. Are you using w:majority writes?
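
For example, a majority write would be expected to time out in this state (a sketch; the wtimeout value is arbitrary):

db.test.insert({ x: 1 }, { writeConcern: { w: "majority", wtimeout: 5000 } })
// expected to return a writeConcernError (wtimeout) while the commit point cannot advance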

Comment by Cailin Nelson [ 21/Nov/16 ]

Yes. If I set writeConcernMajorityJournalDefault: false then the rs.stepDown() command succeeds.
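
For reference, the flag can be applied to a running replica set via a reconfig; a minimal sketch from the shell:

cfg = rs.conf()
cfg.writeConcernMajorityJournalDefault = false
rs.reconfig(cfg)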

Comment by Eric Milkie [ 21/Nov/16 ]

Does it work if you set writeConcernMajorityJournalDefault to false in your replica set config?

{ _id: "foo",
 version: 1,
writeConcernMajorityJournalDefault: false,
members: { etc }
}
