[SERVER-27012] Cannot add hosts to replica set on Docker 1.13.0-rc1 swarm network. Created: 13/Nov/16  Updated: 13/Nov/16  Resolved: 13/Nov/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.10, 3.4.0-rc2
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Daniel Shannon Assignee: Unassigned
Resolution: Done Votes: 0
Labels: replicaset
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Hey Mongoids,
I'm having some difficulty creating a replica set on a Docker Swarm cluster, and I was hoping you could help me figure out why. I've created three separate services (mongo1, mongo2, and mongo3) from the mongo:latest Docker image using the following command, shown here for mongo1:

docker service create --name=mongo1 --mode=replicated --replicas=1 --publish=27017 --network=ignotae-private mongo:latest mongod --replSet=ignotae --storageEngine=wiredTiger
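
(The mongo2 and mongo3 services are created the same way, presumably differing only in the service name:)

docker service create --name=mongo2 --mode=replicated --replicas=1 --publish=27017 --network=ignotae-private mongo:latest mongod --replSet=ignotae --storageEngine=wiredTiger

docker service create --name=mongo3 --mode=replicated --replicas=1 --publish=27017 --network=ignotae-private mongo:latest mongod --replSet=ignotae --storageEngine=wiredTiger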

The three containers come up successfully, usually, as it happens, on physically separate hosts. From each container I can ping the other two (in my configuration they typically resolve to the IP addresses 10.0.72.2, 10.0.72.4, and 10.0.72.6) and connect to them with the Mongo shell using their hostnames.
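
(For reference, a sketch of that connectivity check from the node running the mongo1 task; the container ID is a placeholder, and I'm assuming getent and the mongo shell are available in the mongo:latest image:)

# Resolve each service name from inside the mongo1 container.
docker exec -it <mongo1-container> getent hosts mongo1 mongo2 mongo3
# Confirm a mongod answers at each name.
docker exec -it <mongo1-container> mongo --host mongo2 --quiet --eval 'printjson(db.runCommand({ping: 1}))'
docker exec -it <mongo1-container> mongo --host mongo3 --quiet --eval 'printjson(db.runCommand({ping: 1}))'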

In mongo1, I initiate a replica set and update the first member's hostname:

> rs.initiate()
{
	"info2" : "no configuration specified. Using a default configuration for the set",
	"me" : "1e0b80e3f356:27017",
	"ok" : 1
}
ignotae:SECONDARY> cfg = rs.config()
{
	"_id" : "ignotae",
	"version" : 1,
	"protocolVersion" : NumberLong(1),
	"members" : [
		{
			"_id" : 0,
			"host" : "1e0b80e3f356:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		}
	],
	"settings" : {
		"chainingAllowed" : true,
		"heartbeatIntervalMillis" : 2000,
		"heartbeatTimeoutSecs" : 10,
		"electionTimeoutMillis" : 10000,
		"getLastErrorModes" : {
 
		},
		"getLastErrorDefaults" : {
			"w" : 1,
			"wtimeout" : 0
		},
		"replicaSetId" : ObjectId("582806f337c79bc4a39690c2")
	}
}
ignotae:PRIMARY> cfg.members[0].host = 'mongo1:27017'
mongo1:27017
ignotae:PRIMARY> rs.reconfig(cfg)
{ "ok" : 1 }

All seems to be well, but things begin to go awry when I attempt to add the second service to the set:

ignotae:PRIMARY> rs.add('mongo2:27017')
{
	"ok" : 0,
	"errmsg" : "The hosts mongo1:27017 and mongo2:27017 all map to this node in new configuration version 3 for replica set ignotae",
	"code" : 103
}

The only log line that seemed pertinent was this one:

2016-11-13T06:58:42.869+0000 E REPL     [conn2] replSetReconfig got DuplicateKey: The hosts mongo1:27017 and mongo2:27017 all map to this node in new configuration version 3 for replica set ignotae while validating { _id: "ignotae", version: 3, protocolVersion: 1, members: [ { _id: 0, host: "mongo1:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1.0, host: "mongo2:27017" } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: 2000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('58280e35059ee85bc11db099') } }

For the life of me, I can't figure out what trips this check: the network seems functional, and the services are isolated and distinct in every way I can think of. I'm working on rolling back to Docker 1.12 (instead of the 1.13 release candidate) to see whether some change in its networking stack is the cause, but I'd be grateful for any insight into what might lead MongoDB to believe these hosts are duplicates of one another.
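
(One additional check that might narrow this down, sketched with a placeholder container ID: since the error claims both names map to the same node, compare the hostname reported by the mongod reached through each service name; if both connections report the same hostname, the swarm routing, not MongoDB, is collapsing them:)

# db.hostInfo() reports the hostname of the server the shell is connected to.
docker exec -it <mongo1-container> mongo --host mongo1 --quiet --eval 'print(db.hostInfo().system.hostname)'
docker exec -it <mongo1-container> mongo --host mongo2 --quiet --eval 'print(db.hostInfo().system.hostname)'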

Thanks so much, in advance, for your thoughts!



 Comments   
Comment by Daniel Shannon [ 13/Nov/16 ]

Issue created on Docker's GitHub here: https://github.com/docker/docker/issues/28358.

Comment by Ramon Fernandez Marina [ 13/Nov/16 ]

This data in the description above looks suspicious:

	"members" : [
		{
			"_id" : 0,
			"host" : "1e0b80e3f356:27017",

whereas in the error output you provided the first host shows as mongo1:27017. My guess is that something in the 1.13 version is making the hosts mongo1 and mongo2 resolve to the same address, which triggers the rs.add() check and error message. You may want to try running rs.initiate() on the mongo2 host and see if rs.conf() also shows mongo2 as "host" : "1e0b80e3f356:27017".
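
(A sketch of that check, with a placeholder container ID:)

# Initiate a set on the mongo2 task and see which hostname it picks for itself.
docker exec -it <mongo2-container> mongo --quiet --eval 'rs.initiate(); print(rs.conf().members[0].host)'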

I'm going to close this ticket, since this appears to be an issue with host resolution on the Docker side, and the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience.

Regards,
Ramón.

Comment by Daniel Shannon [ 13/Nov/16 ]

After rolling back to 1.12, I was able to create a replica set as expected:

MongoDB shell version v3.4.0-rc2
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.0-rc2
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
	http://docs.mongodb.org/
Questions? Try the support group
	http://groups.google.com/group/mongodb-user
Server has startup warnings:
2016-11-13T12:32:08.972+0000 I STORAGE  [initandlisten]
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten]
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten]
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten]
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2016-11-13T12:32:09.360+0000 I CONTROL  [initandlisten]
> rs.initiate()
 
{
	"info2" : "no configuration specified. Using a default configuration for the set",
	"me" : "37c034d04579:27017",
	"ok" : 1
}
ignotae:SECONDARY>
ignotae:SECONDARY> cfg = rs.config()
{
	"_id" : "ignotae",
	"version" : 1,
	"protocolVersion" : NumberLong(1),
	"members" : [
		{
			"_id" : 0,
			"host" : "37c034d04579:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		}
	],
	"settings" : {
		"chainingAllowed" : true,
		"heartbeatIntervalMillis" : 2000,
		"heartbeatTimeoutSecs" : 10,
		"electionTimeoutMillis" : 10000,
		"catchUpTimeoutMillis" : 2000,
		"getLastErrorModes" : {
 
		},
		"getLastErrorDefaults" : {
			"w" : 1,
			"wtimeout" : 0
		},
		"replicaSetId" : ObjectId("58285d76f234de7e9931ba48")
	}
}
ignotae:PRIMARY> cfg.members[0].host = 'mongo1:27017'
mongo1:27017
ignotae:PRIMARY> rs.reconfig(cfg)
{ "ok" : 1 }
ignotae:PRIMARY> rs.add('mongo2:27017')
{ "ok" : 1 }
ignotae:PRIMARY> rs.add('mongo3:27017')
{ "ok" : 1 }
ignotae:PRIMARY> rs.config()
{
	"_id" : "ignotae",
	"version" : 4,
	"protocolVersion" : NumberLong(1),
	"members" : [
		{
			"_id" : 0,
			"host" : "mongo1:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		},
		{
			"_id" : 1,
			"host" : "mongo2:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		},
		{
			"_id" : 2,
			"host" : "mongo3:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
 
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		}
	],
	"settings" : {
		"chainingAllowed" : true,
		"heartbeatIntervalMillis" : 2000,
		"heartbeatTimeoutSecs" : 10,
		"electionTimeoutMillis" : 10000,
		"catchUpTimeoutMillis" : 2000,
		"getLastErrorModes" : {
 
		},
		"getLastErrorDefaults" : {
			"w" : 1,
			"wtimeout" : 0
		},
		"replicaSetId" : ObjectId("58285d76f234de7e9931ba48")
	}
}
ignotae:PRIMARY>

I'll certainly kick this over to the Docker folks, but it would still be great if any of you had insight into exactly what MongoDB was seeing here that tripped the duplicate-host check.
