[JAVA-350] com.mongodb.MongoException: not talking to master and retries used up Created: 12/May/11  Updated: 09/Jan/14  Resolved: 10/Aug/12

Status: Closed
Project: Java Driver
Component/s: None
Affects Version/s: 2.5.3
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Michael Conigliaro
Assignee: Brendan W. McAdams
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related

 Description   

After replica set failover in my sharded cluster, my app throws exceptions like this:

ERR [20110512-15:47:43.401] blueeyes: com.mongodb.MongoException: not talking to master and retries used up
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:227)
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:229)
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:229)
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:295)
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBCursor._check(DBCursor.java:354)
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBCursor._hasNext(DBCursor.java:484)
ERR [20110512-15:47:43.401] blueeyes: at com.mongodb.DBCursor.hasNext(DBCursor.java:509)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.RealDatabaseCollection$$anon$1.hasNext(RealMongoImplementation.scala:82)
ERR [20110512-15:47:43.401] blueeyes: at scala.collection.IterableLike$class.isEmpty(IterableLike.scala:92)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.IterableViewImpl.isEmpty(RealMongoImplementation.scala:122)
ERR [20110512-15:47:43.401] blueeyes: at scala.collection.TraversableLike$class.headOption(TraversableLike.scala:483)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.IterableViewImpl.headOption(RealMongoImplementation.scala:122)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.QueryBehaviours$SelectOneQueryBehaviour$class.query(QueryBehaviours.scala:128)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.MongoSelectOneQuery.query(MongoQuery.scala:71)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.MongoSelectOneQuery.query(MongoQuery.scala:71)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.QueryBehaviours$MongoQueryBehaviour$class.apply(QueryBehaviours.scala:16)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.MongoSelectOneQuery.apply(MongoQuery.scala:71)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.MongoSelectOneQuery.apply(MongoQuery.scala:71)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.MongoDatabase$$anonfun$blueeyes$persistence$mongo$MongoDatabase$$mongoActor$1.apply(Mongo.scala:46)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.persistence.mongo.MongoDatabase$$anonfun$blueeyes$persistence$mongo$MongoDatabase$$mongoActor$1.apply(Mongo.scala:43)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.ActorExecutionStrategySequential$$anon$14$$anonfun$submit$1.apply(Actor.scala:17)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.Future$$anonfun$deliver$1.apply(Future.scala:50)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.Future$$anonfun$deliver$1.apply(Future.scala:46)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.ReadWriteLock$class.writeLock(ReadWriteLock.scala:10)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.Future$$anon$1.writeLock(Future.scala:22)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.Future.deliver(Future.scala:45)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.ActorExecutionStrategySequential$$anon$14.submit(Actor.scala:17)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.ActorImplementationMultiThreaded$StrategyWorker$$anonfun$run$1.apply$mcV$sp(Actor.scala:195)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.ActorImplementationMultiThreaded$ActorContext$.withActorFn(Actor.scala:230)
ERR [20110512-15:47:43.401] blueeyes: at blueeyes.concurrent.ActorImplementationMultiThreaded$StrategyWorker.run(Actor.scala:189)
ERR [20110512-15:47:43.401] blueeyes: (...more...)

This may be related to https://jira.mongodb.org/browse/SERVER-3087



 Comments   
Comment by Marius Seritan [ 13/Jun/12 ]

I apologize, this was a mistake on our side. A combination of configuration changes and using the mongo driver in single mode led to us sending write requests to the slave in one of the components.

Sorry for the noise.

Comment by Marius Seritan [ 13/Jun/12 ]

The problem is still occurring after updating all the servers in the replica set to 2.0.6.

It turns out we are using mongo-java-driver-2.6.5.jar; I will try to update to the latest.

Comment by Marius Seritan [ 13/Jun/12 ]

I am getting these errors with just one of the importers running. See the rs.status() below. The master we want to use is us-east1b and I ran the status command on that master. I am not sure if syncingTo is correct.

One more detail that may be critical to mention: us-east1b and us-east3 run mongodb 2.0.5, us-west1 runs mongodb 2.0.6 since yesterday. Can this be the cause of these problems? I will upgrade the other servers now.

PRIMARY> rs.status()
{
	"set" : "xxxxxxxxxx",
	"date" : ISODate("2012-06-13T16:38:06Z"),
	"myState" : 1,
	"syncingTo" : "us-east3:27019",
	"members" : [
		{
			"_id" : 0,
			"name" : "us-east1b:27019",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"optime" : {
				"t" : 1339605462000,
				"i" : 3
			},
			"optimeDate" : ISODate("2012-06-13T16:37:42Z"),
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "us-east3:27019",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 1026471,
			"optime" : {
				"t" : 1339605462000,
				"i" : 3
			},
			"optimeDate" : ISODate("2012-06-13T16:37:42Z"),
			"lastHeartbeat" : ISODate("2012-06-13T16:38:05Z"),
			"pingMs" : 1
		},
		{
			"_id" : 2,
			"name" : "us-west1:27019",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 84970,
			"optime" : {
				"t" : 1339605462000,
				"i" : 3
			},
			"optimeDate" : ISODate("2012-06-13T16:37:42Z"),
			"lastHeartbeat" : ISODate("2012-06-13T16:38:04Z"),
			"pingMs" : 85
		}
	],
	"ok" : 1
}
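
For readers checking a status document like the one above, here is a minimal sketch of the "is there a healthy primary?" check. It assumes the member entries have already been read into simple objects. The class RsStatusCheck, its Member type, and findPrimary are hypothetical names for illustration and are not part of the MongoDB Java driver. The only facts taken from the output are that state 1 corresponds to PRIMARY and health 1 marks a reachable member.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Sketch only: hypothetical helper, not a MongoDB Java driver API.
final class RsStatusCheck {
    // In the rs.status() output above, "state" : 1 is paired with "stateStr" : "PRIMARY".
    static final int STATE_PRIMARY = 1;

    /** Minimal stand-in for one entry of the "members" array. */
    static final class Member {
        final String name;
        final int health;
        final int state;

        Member(String name, int health, int state) {
            this.name = name;
            this.health = health;
            this.state = state;
        }
    }

    /** Returns the name of the first healthy member reporting PRIMARY, if any. */
    static Optional<String> findPrimary(List<Member> members) {
        for (Member m : members) {
            if (m.health == 1 && m.state == STATE_PRIMARY) {
                return Optional.of(m.name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Values copied from the status output above.
        List<Member> members = Arrays.asList(
                new Member("us-east1b:27019", 1, 1),
                new Member("us-east3:27019", 1, 2),
                new Member("us-west1:27019", 1, 2));
        System.out.println(findPrimary(members).orElse("no primary visible"));
    }
}
```

With the members listed above, the check reports us-east1b:27019, which matches the master the status command was run on.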

Comment by Marius Seritan [ 13/Jun/12 ]

I am also getting this error today. I am not aware of any change in the replica set master. I do not see errors in the logs.

Some more data points, may or may not be relevant:

  • there were heavy writes in the system from two import processes that last for hours
  • I have not noticed these problems while running just one of the importers
  • the replica set was hundreds of seconds behind.
  • we import data in a temporary collection, 3 million records, and then we rename the collection.
  • at one point the mongod master process was using 100% CPU

com.mongodb.MongoException: not talking to master and retries used up
	at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:229) 
	at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:231) 
	at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:231) 
	at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:303) 
	at com.mongodb.DBCollection.findOne(DBCollection.java:565) 
	at com.mongodb.DBCollection.findOne(DBCollection.java:554) 

Comment by Scott Hernandez (Inactive) [ 12/Jun/12 ]

We should print the replset status with the exception for better debugging.
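
A rough sketch of what that suggestion could look like, assuming the caller already holds a map of member names to their last known states. ReplSetDiagnostics and describe are hypothetical names chosen for illustration; this is not how the driver builds its exception messages.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: hypothetical helper, not part of the MongoDB Java driver.
final class ReplSetDiagnostics {
    private ReplSetDiagnostics() {}

    /** Appends the last known member states to an error message. */
    static String describe(String error, Map<String, String> memberStates) {
        StringBuilder sb = new StringBuilder(error).append(" (replica set view:");
        if (memberStates.isEmpty()) {
            sb.append(" unknown");
        }
        for (Map.Entry<String, String> e : memberStates.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the message is stable.
        Map<String, String> states = new LinkedHashMap<String, String>();
        states.put("us-east1b:27019", "PRIMARY");
        states.put("us-east3:27019", "SECONDARY");
        System.out.println(describe("not talking to master and retries used up", states));
    }
}
```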

Comment by Jeffrey Yemin [ 12/Jun/12 ]

@Nic, is this a three node replica set? If so, and two of the nodes were down, there wouldn't have been a master, and the driver would not be able to write or read (unless using slaveOk).
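
To illustrate the point about having no master, here is a small sketch of the majority rule for replica set elections: a primary can be elected only while a strict majority of voting members can reach each other. ReplicaSetQuorum and canElectPrimary are hypothetical names used for illustration and are not driver or server APIs.

```java
// Sketch only: models the election majority rule, not actual server code.
final class ReplicaSetQuorum {
    private ReplicaSetQuorum() {}

    /** Returns true when the reachable voting members form a strict majority. */
    static boolean canElectPrimary(int votingMembers, int reachableMembers) {
        if (votingMembers <= 0 || reachableMembers < 0 || reachableMembers > votingMembers) {
            throw new IllegalArgumentException("invalid member counts");
        }
        // Integer division: for 3 members the threshold is more than 1, so 2 are needed.
        return reachableMembers > votingMembers / 2;
    }

    public static void main(String[] args) {
        System.out.println(canElectPrimary(3, 1)); // false: two of three members down
        System.out.println(canElectPrimary(3, 2)); // true: one of three members down
    }
}
```

In a three-member set with two members down, the remaining node cannot form a majority, so it would not stay primary and the driver would have no master to talk to.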

Comment by Nic Cottrell (Personal) [ 12/Jun/12 ]

I just got this with server 2.0.5 and java client drivers 2.7.3. The server was part of a replica set, but the arbiter and other node have been down for weeks and this was the first time I saw this message.

Comment by Eliot Horowitz (Inactive) [ 08/Dec/11 ]

Are you sure there is a master currently?

Comment by Amit [ 08/Dec/11 ]

Is there any workaround or patch for this issue?

Facing the same issue with a replica set of 2 DB + 1 arbiter
driver 2.7.2
Mongo 2.0

Comment by Nic Cottrell (Personal) [ 05/Nov/11 ]

I just got this too:

com.mongodb.MongoException: not talking to master and retries used up
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:229)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:231)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:231)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:303)
at com.mongodb.DBCursor._check(DBCursor.java:360)
at com.mongodb.DBCursor._hasNext(DBCursor.java:490)
at com.mongodb.DBCursor.hasNext(DBCursor.java:515)
at com.google.code.morphia.query.MorphiaIterator.hasNext(MorphiaIterator.java:40)
at com.google.code.morphia.query.QueryImpl.asKeyList(QueryImpl.java:273)
at com.google.code.morphia.mapping.ReferenceMapper.exists(ReferenceMapper.java:258)
at com.google.code.morphia.mapping.ReferenceMapper.readSingle(ReferenceMapper.java:159)
at com.google.code.morphia.mapping.ReferenceMapper.fromDBObject(ReferenceMapper.java:145)
at com.google.code.morphia.mapping.Mapper.readMappedField(Mapper.java:505)
at com.google.code.morphia.mapping.Mapper.fromDb(Mapper.java:484)

But I am running a single instance server with journalling... not replicas at all.
