[SERVER-16282] Can't initiate a ReplSet with chainingAllowed: False Created: 22/Nov/14  Updated: 14/Jan/15  Resolved: 22/Nov/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.8.0-rc1
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Alvin Richards (Inactive) Assignee: Unassigned
Resolution: Done Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-16280 chainingallowed: false can spin Closed
Operating System: ALL
Participants:

 Description   

Problem

Starting a ReplSet with chainingAllowed: false appears to hang. Nothing in the log

2014-11-22T01:29:45.868-0500 I CONTROL  [initandlisten] MongoDB starting : pid=24593 port=27017 dbpath=/data2/db/db100 64-bit host=ip-10-93-7-23.ec2.internal
2014-11-22T01:29:45.868-0500 I CONTROL  [initandlisten] db version v2.8.0-rc1
2014-11-22T01:29:45.868-0500 I CONTROL  [initandlisten] git version: 0a02891e3f155e5a187ee14c774d42cbdb290652
2014-11-22T01:29:45.868-0500 I CONTROL  [initandlisten] build info: Linux build19.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2014-11-22T01:29:45.868-0500 I CONTROL  [initandlisten] allocator: tcmalloc
2014-11-22T01:29:45.868-0500 I CONTROL  [initandlisten] options: { net: { port: 27017 }, processManagement: { fork: true }, replication: { oplogSizeMB: 500, replSet: "mp" }, storage: { dbP
ath: "/data2/db/db100", engine: "mmapv1", mmapv1: { syncPeriodSecs: 14400.0 } }, systemLog: { destination: "file", path: "/data3/db/db100/server.log" } }
2014-11-22T01:29:45.878-0500 I JOURNAL  [initandlisten] journal dir=/data2/db/db100/journal
2014-11-22T01:29:45.879-0500 I JOURNAL  [initandlisten] recover : no journal files present, no recovery needed
2014-11-22T01:29:45.927-0500 I INDEX    [initandlisten] allocating new ns file /data2/db/db100/local.ns, filling with zeroes...
2014-11-22T01:29:45.984-0500 I STORAGE  [FileAllocator] allocating new datafile /data2/db/db100/local.0, filling with zeroes...
2014-11-22T01:29:45.984-0500 I STORAGE  [FileAllocator] creating directory /data2/db/db100/_tmp
2014-11-22T01:29:46.222-0500 I STORAGE  [FileAllocator] done allocating datafile /data2/db/db100/local.0, size: 64MB,  took 0.236 secs
2014-11-22T01:29:46.225-0500 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument Did not find replica set configuration documen
t in local.system.replset
2014-11-22T01:29:46.226-0500 I NETWORK  [initandlisten] waiting for connections on port 27017
2014-11-22T01:30:26.520-0500 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51467 #1 (1 connection now open)
2014-11-22T01:30:26.523-0500 I NETWORK  [conn1] end connection 127.0.0.1:51467 (0 connections now open)
2014-11-22T01:30:26.525-0500 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51468 #2 (1 connection now open)
2014-11-22T01:30:26.527-0500 I NETWORK  [conn2] end connection 127.0.0.1:51468 (0 connections now open)
2014-11-22T01:30:26.813-0500 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51471 #3 (1 connection now open)
2014-11-22T01:30:27.259-0500 I NETWORK  [conn3] end connection 127.0.0.1:51471 (0 connections now open)
2014-11-22T01:30:36.081-0500 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51473 #4 (1 connection now open)
2014-11-22T01:31:58.591-0500 I REPL     [conn4] replSetInitiate admin command received from client

Reproduce

/home/ec2-user/mongodb-linux-x86_64-2.8.0-rc1/bin/mongod --port 27017 --dbpath /data2/db/db100 --logpath /data3/db/db100/server.log --fork --storageEngine=mmapv1 --syncdelay 14400 --replSet mp --oplogSize 500
/home/ec2-user/mongodb-linux-x86_64-2.8.0-rc1/bin/mongod --port 27018 --dbpath /data2/db/db200 --logpath /data3/db/db200/server.log --fork --storageEngine=mmapv1 --syncdelay 14400 --replSet mp --oplogSize 500
/home/ec2-user/mongodb-linux-x86_64-2.8.0-rc1/bin/mongod --port 27019 --dbpath /data2/db/db300 --logpath /data3/db/db300/server.log --fork --storageEngine=mmapv1 --syncdelay 14400 --replSet mp --oplogSize 500
 
var config = { _id: "mp", members: [ { _id: 0, host: "localhost:27017",priority:10 }, {_id: 1, host: "localhost:27018" }, { _id: 3, host: "localhost:27019" } ],settings: {chainingAllowed: false} }; 
rs.initiate( config ); 

Workaround

With chainingAllowed: true, the ReplSet will start up just fine.



 Comments   
Comment by Alvin Richards (Inactive) [ 22/Nov/14 ]

There were two mongod processes left running on this box, which were not being killed by a killall. lsof showed that 27018 and 27019 ports were in use. Kill -9 the mongod's and then a new ReplSet start just fine.

Closed as "Works as Designed" - this was not a bug.

Comment by Scott Hernandez (Inactive) [ 22/Nov/14 ]

I just tried a few repros and can't see any problems. I used this js, which is very similar to rollback_too_new.js:

var rt = new ReplSetTest({nodes:3})
var c = { "_id" : "testReplSet", "members" : [
		{"_id" : 0,
			"host" : "localhost:31000",
			"priority" : 10},
		{"_id" : 1,
			"host" : "localhost:31001"},
		{"_id" : 2,
			"host" : "localhost:31002"}],
	"settings" : {"chainingAllowed" : false}};
rt.startSet()
rt.initiate(c)
assert.writeOK(rt.getPrimary().getDB("test").a.save({}))
rt.stopSet()

Comment by Andy Schwerin [ 22/Nov/14 ]

Interesting. We have unit tests that set chainingAllowed: false that continue to work, so something else is also going on, here. Can you attach logs, just in case we have trouble reproducing?

Comment by Alvin Richards (Inactive) [ 22/Nov/14 ]

may be related to SERVER-16280 (but that's during initialSync).

Generated at Thu Feb 08 03:40:33 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.