[SERVER-22139] Config server reports that it is in replica set mode, but we are still using the legacy SCCC protocol for config server communication Created: 12/Jan/16  Updated: 26/Jan/16  Resolved: 25/Jan/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.2.0
Fix Version/s: None

Type: Bug
Priority: Minor - P4
Reporter: João Soares
Assignee: Spencer Brody (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

Just set up a sharded cluster following the normal 3.2 guide and, instead of using:

rs.initiate( {
   _id: "configReplSet",
   configsvr: true,
   members: [
      { _id: 0, host: "<host1>:<port1>" },
      { _id: 1, host: "<host2>:<port2>" },
      { _id: 2, host: "<host3>:<port3>" }
   ]
} )

just add the first one and then the following 2 separately:

rs.initiate( {
   _id: "configReplSet",
   configsvr: true,
   members: [
      { _id: 0, host: "jasoares-mbp:27019" },
   ]
} );
rs.add({ host: 'jasoares-mbp:27020' });
rs.add({ host: 'jasoares-mbp:27021' })

The rest is just the normal configuration files, similar to those shown in the guide:

Config Server configuration file:

systemLog:
  destination: file
  path: '/usr/local/var/log/mongodb/mongodcfg0.log'
  logAppend: true
storage:
  dbPath: '/usr/local/var/mongodcfg0'
processManagement:
  fork: true
sharding:
  clusterRole: configsvr
replication:
  replSetName: configReplSet
net:
  bindIp:
    - 127.0.0.1
  port: 27019

Config Server Conf:

{
	"_id" : "configReplSet",
	"version" : 3,
	"configsvr" : true,
	"protocolVersion" : NumberLong(1),
	"members" : [
		{
			"_id" : 0,
			"host" : "jasoares-mbp:27019",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
				
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		},
		{
			"_id" : 1,
			"host" : "jasoares-mbp:27020",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
				
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		},
		{
			"_id" : 2,
			"host" : "jasoares-mbp:27021",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
				
			},
			"slaveDelay" : NumberLong(0),
			"votes" : 1
		}
	],
	"settings" : {
		"chainingAllowed" : true,
		"heartbeatIntervalMillis" : 2000,
		"heartbeatTimeoutSecs" : 10,
		"electionTimeoutMillis" : 10000,
		"getLastErrorModes" : {
			
		},
		"getLastErrorDefaults" : {
			"w" : 1,
			"wtimeout" : 0
		}
	}
}

Participants:

 Description   

When making a fresh 3.2 deployment on my development Mac with 3 new and clean WiredTiger CSRS config servers, I keep getting the following error messages on mongos as soon as any balancing starts to occur:

2016-01-12T01:03:12.671+0000 I SHARDING [Balancer] caught exception while doing balance: Need to swap sharding catalog manager.  Config server reports that it is in replica set mode, but we are still using the legacy SCCC protocol for config server communication
2016-01-12T01:03:12.671+0000 I SHARDING [Balancer] about to log metadata event into actionlog: { _id: "jasoares-mbp.local-2016-01-12T01:03:12.671+0000-569450d00b155d340d026352", server: "jasoares-mbp.local", clientAddr: "", time: new Date(1452560592671), what: "balancer.round", ns: "", details: { executionTimeMillis: 31, errorOccured: true, errmsg: "Need to swap sharding catalog manager.  Config server reports that it is in replica set mode, but we are still using the legacy SCCC protocol for conf..." } }

None of the config server configurations has the flag configsvrMode: sccc present.
Config server conf contains "protocolVersion" : NumberLong(1).

Just as I was writing this very issue, I noticed I did something different from what the guide says: I did not add the 3 hosts when initiating the CSRS. I added just one and then added the other two separately using

rs.add({ host: 'jasoares-mbp:27020' })

When I started all over again and added all 3 at once, my problems were fixed, but the question remains: why wouldn't it work? I ended up submitting the issue as I see others doing the same.



 Comments   
Comment by Spencer Brody (Inactive) [ 15/Jan/16 ]

Hi jasoares,
Thanks for filing this report. I just backported some fixes to the 3.2 branch that should cause this problem to go away. Can you try re-testing with the unstable nightly build from last night?

Comment by João Soares [ 12/Jan/16 ]

Thank you Andy, you are right, it has nothing to do with the initiation process of the first config server on the replica set. The cause is indeed the string passed to the mongos --configdb.
May I suggest a change to the exception message to reflect this in some way? It would probably save a lot of time debugging situations like this.

I would change this to an improvement if I could edit the issue.
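
As an illustration of the replica set form of the --configdb string, here is a minimal sketch that derives it from a replica set config document like the one under "Config Server Conf". The helper name configdbFromReplSetConfig is hypothetical and not part of MongoDB. It only concatenates the set name and member hosts, and does not check that the hosts are reachable.

```javascript
// Hypothetical helper (not a MongoDB API): build "<setName>/<host1>,<host2>,..."
// from a replica set config document such as the one returned by rs.conf().
function configdbFromReplSetConfig(conf) {
  if (!conf || typeof conf._id !== "string" || !Array.isArray(conf.members)) {
    throw new Error("expected a document with a string _id and a members array");
  }
  // Collect the "host:port" string of every member, in config order.
  var hosts = conf.members.map(function (member) {
    return member.host;
  });
  return conf._id + "/" + hosts.join(",");
}

// Trimmed-down copy of the config document from this report.
var exampleConf = {
  _id: "configReplSet",
  configsvr: true,
  members: [
    { _id: 0, host: "jasoares-mbp:27019" },
    { _id: 1, host: "jasoares-mbp:27020" },
    { _id: 2, host: "jasoares-mbp:27021" }
  ]
};

var configdb = configdbFromReplSetConfig(exampleConf);
// configdb is "configReplSet/jasoares-mbp:27019,jasoares-mbp:27020,jasoares-mbp:27021"
```

The resulting string is the value that would be passed to mongos --configdb for this deployment, assuming mongos can resolve the same host names as the config servers.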

Comment by Kaloian Manassiev [ 12/Jan/16 ]

spencer, will this bug, which is easy to cause by a typo, go away once we have the zero-downtime upgrade logic in place?

Comment by Andy Schwerin [ 12/Jan/16 ]

What string are you using for the --configdb argument to mongos? It should be a replica set connection string, like configReplSet/host1,host2,host3.

I think you're passing an old-style connection string, host1,host2,host3 (without the replset name at front).
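
The difference between the two string forms described above can be sketched as a small check. This is an illustration only: classifyConfigdb and splitHosts are hypothetical names, and the check looks only at the shape of the string (whether a "<setName>/" prefix is present), not at how mongos itself parses the argument.

```javascript
// Hypothetical helper: classify a --configdb value by its shape only.
// "replicaSet" means "<setName>/<host1>,<host2>,...";
// "legacy" means a bare host list with no replica set name in front.
function classifyConfigdb(value) {
  var slash = value.indexOf("/");
  if (slash === -1) {
    // No "<setName>/" prefix: the old-style host list.
    return { style: "legacy", setName: null, hosts: splitHosts(value) };
  }
  return {
    style: "replicaSet",
    setName: value.slice(0, slash),
    hosts: splitHosts(value.slice(slash + 1))
  };
}

// Split a comma-separated host list, dropping empty entries.
function splitHosts(list) {
  return list
    .split(",")
    .map(function (host) { return host.trim(); })
    .filter(function (host) { return host.length > 0; });
}

var replSetForm = classifyConfigdb("configReplSet/host1,host2,host3");
var legacyForm = classifyConfigdb("host1,host2,host3");
// replSetForm.style is "replicaSet", replSetForm.setName is "configReplSet"
// legacyForm.style is "legacy", legacyForm.setName is null
```

A check like this could be run on a startup script's arguments before launching mongos, so that a missing set name is caught before the balancer starts logging the error from this report.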

Generated at Thu Feb 08 03:59:30 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.