[SERVER-6492] isMaster() hangs when called on REMOVED node
Created: 17/Jul/12  Updated: 11/Jul/16  Resolved: 23/Jul/12

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.1.2
Fix Version/s: 2.2.0-rc1

Type: Bug
Priority: Critical - P2
Reporter: Tyler Brock
Assignee: Eric Milkie
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

When you call rs.isMaster() on a REMOVED node in a replica set it just hangs.



 Comments   
Comment by auto [ 23/Jul/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-07-23T08:29:17-07:00)

Message: SERVER-6492 isMaster() no longer hangs on a REMOVED node
Branch: master
https://github.com/mongodb/mongo/commit/e40c25fc61b0c3db0d6fdfab414ad5d0e5da4d19

Comment by Tyler Brock [ 18/Jul/12 ]

I think that, at the very least, isMaster should return "ismaster" along with "ok".

Even running on a standalone mongod it gives me this:

{
	"ismaster" : true,
	"maxBsonObjectSize" : 16777216,
	"localTime" : ISODate("2012-07-18T02:17:53.924Z"),
	"ok" : 1
}

For historical reasons, drivers also use isMaster to verify the maxBsonObjectSize and to get the server's time. I think that even if the member has been removed, it needs to return, at a minimum, those three fields, if not more that provide additional information, such as the set it was a member of ("setName").
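As a rough illustration of the point above, the following Python sketch checks whether an isMaster reply carries the minimum fields a driver would read. It assumes "those three fields" means "ismaster", "maxBsonObjectSize" and "localTime" from the standalone output quoted above. The function name `missing_minimum_fields` and the bare `{"ok": 0}` reply are hypothetical and not taken from any driver or from the server fix.

```python
from datetime import datetime, timezone

# Assumption: these are the three fields the comment refers to.
REQUIRED_FIELDS = ("ismaster", "maxBsonObjectSize", "localTime")


def missing_minimum_fields(reply):
    """Return the required field names that are absent from an isMaster reply dict."""
    return [name for name in REQUIRED_FIELDS if name not in reply]


# The standalone reply quoted in this ticket, written as a plain dict.
standalone_reply = {
    "ismaster": True,
    "maxBsonObjectSize": 16777216,
    "localTime": datetime(2012, 7, 18, 2, 17, 53, 924000, tzinfo=timezone.utc),
    "ok": 1,
}

# Hypothetical reply from a REMOVED node that carries only a status field.
bare_reply = {"ok": 0}
```

With these inputs, `missing_minimum_fields(standalone_reply)` returns an empty list, while `missing_minimum_fields(bare_reply)` lists all three names, which is the situation a driver would have trouble with.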

Comment by Eric Milkie [ 17/Jul/12 ]

Fair enough, we'll have to fix this for 2.2. Would it be ok for isMaster to return ok:0?

Comment by Tyler Brock [ 17/Jul/12 ]

A majority of the MongoDB drivers depend on being able to call isMaster() on all members of a replica set to determine the health of the set, identify the primary, find the non-hidden secondaries available for reads, etc.

If the driver has not yet detected that a node has been removed and attempts to call isMaster on it (which is the case for most drivers by default), the call will hang.
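To make the failure mode concrete, here is a minimal Python sketch of a monitor that calls isMaster on every known member and treats a member that does not answer within a deadline as unreachable. It is not how any particular driver is implemented. `probe_members`, the `probe` callable and the `timeout_s` parameter are hypothetical names, and `probe` stands in for whatever function actually sends the command.

```python
import threading
import time


def probe_members(hosts, probe, timeout_s=1.0):
    """Call probe(host) for every host in parallel and map host -> reply or None.

    None marks a member whose probe raised or did not return within timeout_s.
    """
    results = {host: None for host in hosts}

    def worker(host):
        try:
            results[host] = probe(host)
        except Exception:
            pass  # an error is treated the same as no answer

    # Daemon threads, so a probe that never returns cannot block process exit.
    threads = [
        threading.Thread(target=worker, args=(host,), daemon=True)
        for host in hosts
    ]
    for thread in threads:
        thread.start()

    # One shared deadline for the whole round, not one timeout per member.
    deadline = time.monotonic() + timeout_s
    for thread in threads:
        thread.join(max(0.0, deadline - time.monotonic()))

    # Return a snapshot so late replies do not change the caller's view.
    return dict(results)
```

Without a deadline of this kind, a single REMOVED node that never replies would stall the whole health check, which is the behaviour described above.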

Comment by Eric Milkie [ 17/Jul/12 ]

Why would you call rs.isMaster() on a REMOVED node using the shell? This issue seems cosmetic and would be hard to hit if you were using a driver to connect to a replica set. You have to force the shell to connect directly to a removed node, after you've removed it.
It won't be easy to fix except with a hack. If you add the node back into the replica set, or shut it down, the command stops hanging.
