[JAVA-440] errors when using localhost Created: 29/Sep/11  Updated: 25/Jun/13  Resolved: 09/Nov/11

Status: Closed
Project: Java Driver
Component/s: Cluster Management
Affects Version/s: 2.6.3
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Kevin Williams
Assignee: Antoine Girbal
Resolution: Done
Votes: 0
Labels: down, localhost, master, replicaset
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MacBookPro laptop running Mac OS X 10.6.8, MongoDB 2.0.0 from machomebrew, mongo-java-driver-2.6.3.jar


Attachments: Test.java, Test2.java

 Description   

Console output at the end.

Running MongoDB locally, bound to localhost. The client keeps logging ReplicaSetStatus updates - that's problem #1.

Then the client gets a connection refused from my NIC IP address - why is it trying to connect that way? Problem #2.

Then, it thinks the replicaset master (there is no replica set, mind you) has moved from localhost to localhost, but thinks that server is down. That's problems #3 and #4.

Then, it thinks the server is down from that point forward. My app will fail until I restart the app. Then this whole process starts over just a couple of minutes later.

Running our app from Ant eventually gives me the following output:

[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Sep 29, 2011 12:50:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen up: localhost:27017
[java] Starting IDM Server on port 3002
[java] Sep 29, 2011 12:55:19 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen down: localhost:27017
[java] java.io.IOException: couldn't connect to [kevinw-mbp/10.180.131.72:27017] bc:java.net.ConnectException: Connection refused
[java] at com.mongodb.DBPort._open(DBPort.java:206)
[java] at com.mongodb.DBPort.go(DBPort.java:94)
[java] at com.mongodb.DBPort.go(DBPort.java:75)
[java] at com.mongodb.DBPort.findOne(DBPort.java:129)
[java] at com.mongodb.DBPort.runCommand(DBPort.java:138)
[java] at com.mongodb.ReplicaSetStatus$Node.update(ReplicaSetStatus.java:162)
[java] at com.mongodb.ReplicaSetStatus$Node.update(ReplicaSetStatus.java:156)
[java] at com.mongodb.ReplicaSetStatus.ensureMaster(ReplicaSetStatus.java:311)
[java] at com.mongodb.DBTCPConnector.checkMaster(DBTCPConnector.java:393)
[java] at com.mongodb.ReplicaSetStatus$Updater.run(ReplicaSetStatus.java:292)
[java] Sep 29, 2011 12:55:19 PM com.mongodb.DBTCPConnector _set
[java] WARNING: Master switching from localhost:27017 to localhost:27017
[java] Sep 29, 2011 12:55:29 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen down: localhost:27017
[java] Sep 29, 2011 12:55:59 PM com.mongodb.ReplicaSetStatus$Node update
[java] WARNING: Server seen down: localhost:27017



 Comments   
Comment by Kevin Williams [ 09/Nov/11 ]

No, the machine configuration actually had nothing to do with it - we get the same ReplicaSetStatus warnings on our servers if configured for one mongodb server. I was unaware of how the different constructor signatures would affect the runtime behavior, which gives a reproducible cause & effect.

Thanks for the clarification, and thanks very much for the thorough investigation!

Sorry you couldn't be here today, Scott. Kyle says "hi".

Comment by Scott Hernandez (Inactive) [ 09/Nov/11 ]

It seems like this is really a machine configuration issue and not a driver issue. If there are other issues that have come from this discussion, please open new jira issues for them specifically.

Comment by Scott Hernandez (Inactive) [ 09/Nov/11 ]

The constructor that takes a list is exclusively for replica set connection mode. It is designed to take a seed list of servers to contact. A single item list is not a bad thing, and very useful in many cases of deployments.

If you don't want that behavior, don't use the List<T> constructors, as Ryan has stated.

We will be working to clean up the MongoURI syntax so you can specify a mode to disable replica set connection mode when passing in a list, or to enable it when sending a single server address. Unfortunately, these kinds of changes need to happen across all driver implementations (for all languages) for consistency, and that takes more time.

Comment by Ryan Nitz [ 09/Nov/11 ]

Scott (another driver engineer) is going to jump in here in a bit, but for now we are not going to change Mongo(List<ServerAddress> addrs) to behave differently when there is only a single host, because there is another constructor which takes a single ServerAddress. We very much appreciate your feedback, and I will definitely update the javadocs to clarify the behavior of the various constructors.

Comment by Kevin Williams [ 09/Nov/11 ]

While I'm sure that everything you said is correct, I believe the localhost name resolution is a red herring for this issue, which turned out to be about passing a single-item list into the Mongo object constructor and having ReplicaSet connection errors as a result. I'd like to steer the discussion back to that cause/effect now.

My understanding is that if a List is passed to the Mongo constructor, it expects to manage connections to a ReplicaSet style of cluster, and attempts to track the master so that it knows where to send write operations. It seems like there is some input validation or error checking that could be added here, to prevent a single-item List from operating in this clustered mode.

You could certainly say "You're holding it wrong." and perhaps send out a free bumper case to "resolve" the issue. I would prefer, if I may make a suggestion, to have the driver reject attempts to run with only one entry in a List, since the List-based constructor assumes this clustered mode of operation. Perhaps throw an exception, or at least log a stern warning about "bad things are about to happen"?
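
As a rough sketch of this suggestion, a check like the one below could run in application code before the list is handed to the List-based constructor. The class and method names (SeedListGuard, check) are hypothetical and not part of the driver, and the strict flag is an assumption made for illustration: it chooses between throwing and only logging a warning through java.util.logging.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical application-side guard; not part of the MongoDB Java driver. */
final class SeedListGuard {

    private static final Logger LOG = Logger.getLogger(SeedListGuard.class.getName());

    private SeedListGuard() {}

    /**
     * Returns true when the list holds two or more entries.
     * A single-entry list is rejected in strict mode and only logged otherwise.
     */
    static boolean check(List<?> seeds, boolean strict) {
        if (seeds == null || seeds.isEmpty()) {
            throw new IllegalArgumentException("seed list must not be empty");
        }
        if (seeds.size() > 1) {
            return true;
        }
        String message = "Seed list has a single entry (" + seeds.get(0)
                + "); the List-based constructor will still use replica set mode";
        if (strict) {
            throw new IllegalArgumentException(message);
        }
        LOG.warning(message);
        return false;
    }

    public static void main(String[] args) {
        // Two hosts: accepted silently.
        System.out.println(check(Arrays.asList("db1:27017", "db2:27017"), true));
        // One host, lenient mode: logs a warning and returns false.
        System.out.println(check(Collections.singletonList("localhost:27017"), false));
    }
}
```

Whether to throw or warn is a judgment call. Throwing fails fast during development, while a warning leaves a deliberately configured single-entry list usable.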

Comment by Ryan Nitz [ 09/Nov/11 ]

Ok... I completely understand about keeping your network private. Are the local canonical and local addr the same as your internal address or are they the external address? If they are an external address, this is a problem.

The local name is weird... most macs have something like kevinw-mbp.local (versus just kevinw-mbp). I am guessing this is related to you overriding the hostname.

As of now, I am inclined to believe that there is something slightly off on your laptop. I am a bit hesitant about putting a localhost check/resolution override into the driver because oddly enough, some people override localhost.

The easiest way around this is to use 127.0.0.1 instead of localhost for dev.

Thoughts?

Comment by Kevin Williams [ 09/Nov/11 ]

I might have used:

sudo scutil --set HostName <putinyourhostname_or_fqdn_here>

but I'm not sure.

Comment by Kevin Williams [ 09/Nov/11 ]

The uname response was:
Darwin kevinw-mbp 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386

I'm not comfortable divulging too much detail of our internal corporate network. I can say that the Test2 results were predictable localhost-style results except for the last one, the localAddr one.

canonical: localhost - addr: 127.0.0.1 - name: localhost
canonical: localhost - addr: 0:0:0:0:0:0:0:1 - name: localhost
canonical: fe80:0:0:0:0:0:0:1%1 - addr: fe80:0:0:0:0:0:0:1%1 - name: localhost
local canonical: ##.##.##.## - local addr: ##.##.##.## - local name: kevinw-mbp

I did do something a year or two ago in an attempt to force my hostname, because our DHCP system was giving me a new host each day. I'm not sure what that was exactly, but it was designed to only force the hostname - if it affected what InetAddress.getLocalHost() returns, that would be an unexpected side-effect.
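
For readers without the attachment, a program that prints lines in the shape shown above might look roughly like the following. This is a guess at the approach, not the attached Test2.java. The class name HostResolutionCheck and the describe helper are hypothetical, and only standard java.net.InetAddress calls are used.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical diagnostic; not the Test2.java attached to this issue. */
final class HostResolutionCheck {

    private HostResolutionCheck() {}

    /** Formats one address in the "canonical - addr - name" layout used above. */
    static String describe(String prefix, InetAddress address) {
        return prefix + "canonical: " + address.getCanonicalHostName()
                + " - " + prefix + "addr: " + address.getHostAddress()
                + " - " + prefix + "name: " + address.getHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Every address that the name "localhost" resolves to on this machine.
        for (InetAddress address : InetAddress.getAllByName("localhost")) {
            System.out.println(describe("", address));
        }
        // The address Java reports for the local host, which can differ from
        // the loopback address when the hostname maps to a NIC address.
        System.out.println(describe("local ", InetAddress.getLocalHost()));
    }
}
```

If the last line shows a non-loopback address while mongod is bound only to 127.0.0.1, any connection attempt made through that address would be refused.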

Comment by Ryan Nitz [ 09/Nov/11 ]

This is slightly strange...

can you run the following commands:

uname -a

ifconfig

Also, I have attached Test2.java. This is just a simple Java program that diagnoses your network.

Thanks for troubleshooting this with us...

Comment by Kevin Williams [ 08/Nov/11 ]

I get 'connection refused' errors because it's trying to connect to my public IP instead of 127.0.0.1 and I have my local MongoDB bound only to 127.0.0.1. I don't understand why it is trying to use my public IP when the code is specifying 'localhost'.

I can use 'localhost' in our app code without getting this error, but I don't understand what the difference is.

No proxy or vpn here.

Comment by Ryan Nitz [ 08/Nov/11 ]

Ok... that part of your /etc/hosts file is the same as mine.

I attached a test and I cannot recreate this issue.

Can you try running the attached file and let me know what you see?

Thanks!

Oh, do you have a proxy/vpn running on your machine?

Comment by Ryan Nitz [ 08/Nov/11 ]

Tested against 2.7.0.

Comment by Kevin Williams [ 08/Nov/11 ]

We use our local host files for a lot of internal things at work, so I can't post the whole thing.

Here's the 'localhost' portion of it.

127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost

Comment by Ryan Nitz [ 08/Nov/11 ]

Glad you have a workaround. My workaround was to use 127.0.0.1 instead of localhost

Ok... can you post the following:

cat /etc/hosts

I had the same problem, but when I upgraded to a new Mac, the problem went away.

Thx

Comment by Kevin Williams [ 08/Nov/11 ]

If I detect a single-entry list before passing it in to the Mongo object constructor, I can pass the single entry instead. So, by being smart about it on my end I can avoid the error.

If I do pass a single-entry list to the constructor, the error still happens.
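
The workaround described above can be sketched as follows. To keep the example runnable without the driver on the classpath, the two connection paths are passed in as functions. The ConnectionModeChooser class and its connect method are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper; not part of the MongoDB Java driver. */
final class ConnectionModeChooser {

    private ConnectionModeChooser() {}

    /**
     * Routes a configured address list to the single-server path when it has
     * exactly one entry, and to the seed-list path otherwise.
     */
    static <A, C> C connect(List<A> addresses,
                            Function<A, C> singleServer,
                            Function<List<A>, C> seedList) {
        if (addresses == null || addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one server address is required");
        }
        if (addresses.size() == 1) {
            // One entry: hand over the bare address instead of the list.
            return singleServer.apply(addresses.get(0));
        }
        // Two or more entries: treat the list as a replica set seed list.
        return seedList.apply(addresses);
    }

    public static void main(String[] args) {
        // Stand-in factories that only report which path was taken.
        Function<String, String> single = address -> "single:" + address;
        Function<List<String>, String> seeds = list -> "seedList:" + list.size();

        System.out.println(connect(Collections.singletonList("127.0.0.1:27017"), single, seeds));
        System.out.println(connect(Arrays.asList("db1:27017", "db2:27017"), single, seeds));
    }
}
```

In the real application, the two functions would presumably wrap the Mongo(ServerAddress) and Mongo(List<ServerAddress>) constructors discussed in this thread.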

I have tried both the 2.6.5 driver and the new 2.7.0 driver, and the behavior is the same as with 2.6.3.

Comment by Ryan Nitz [ 08/Nov/11 ]

Hi Kevin,

Has this been resolved? Let us know so I can close this issue.

Thanks,

-Ryan

Comment by Kevin Williams [ 29/Sep/11 ]

OK, cool.

I am using mongod 2.0.0 locally, by the way.

Yes, localhost resolves to 127.0.0.1.

We'll be upgrading the driver to 2.6.5 and the production servers from 1.8.2 to 2.0.0 after this release goes out, but for now I have to keep the 2.6.3 driver.

Thanks for the help!

Comment by Scott Hernandez (Inactive) [ 29/Sep/11 ]

Yes, if you use the list constructor it will treat it as a replSet.

With newer versions of mongod (1.8+) we could detect that it isn't a replSet, but not with earlier versions. It shouldn't cause problems though.

Does localhost resolve to 127.0.0.1? I think on osx it might not in some cases. Also, you might want to upgrade to 2.6.5 as it has some related fixes.

Comment by Kevin Williams [ 29/Sep/11 ]

I believe so. Our 'Repository' classes have two constructors, one for a single server, and one for a list. The one that takes a list also sets up slaveOk() and WriteConcern.REPLICAS_SAFE.

Thanks to your hint, it's a good bet that we're using the list but I only have one in my list, and that's causing all this replica-style behavior. Is that what you're thinking?

Comment by Scott Hernandez (Inactive) [ 29/Sep/11 ]

How are you constructing your Mongo instances? Are you supplying a list of one server?
