- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 1.5.1
- Component/s: None
- None
- Environment: MongoDB 2.0.1, Ruby driver 1.5.1.
When we apply the recovery procedure described here - https://github.com/mongodb/mongo-ruby-driver/blob/master/docs/REPLICA_SETS.md - and the primary node goes down, our application hangs. We were able to reproduce the problem with a small sample script.
The following script connects to two of the replicas, named 'replicaone' and 'replicatwo', issues a 'count' for a given collection, waits for user input, and issues the count again.
The following steps reproduce the problem, supposing 'replicaone' is the primary (verified by opening a shell on the machine):
- Run the script until it waits for user input
- Shut down the 'replicaone' node
- Press enter to let the script continue to run
- The script then prints "Retrying" once and hangs.
We tried setting "connect_timeout", "op_timeout", and even "pool_timeout" to very short periods to no avail.
Here's the code:
sample.rb
#!/usr/bin/ruby
require 'rubygems'
require 'mongo'

@conn = Mongo::ReplSetConnection.new(['replicaone', 27017], ['replicatwo', 27017],
                                     :connect_timeout => 2, :op_timeout => 2)
@db = @conn['database']

# Copied verbatim from the URL above, with one extra "puts" inserted
def rescue_connection_failure(max_retries=60)
  retries = 0
  begin
    yield
  rescue Mongo::ConnectionFailure => ex
    retries += 1
    raise ex if retries > max_retries
    sleep(0.5)
    puts "Retrying"
    retry
  end
end

rescue_connection_failure do
  puts "There are #{@db['collection'].count} records."
  gets
  puts "There are #{@db['collection'].count} records."
end
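To help narrow down where the hang occurs, the retry helper can be exercised on its own, without the driver or a replica set. The following is a minimal sketch under stated assumptions: `FakeConnectionFailure`, `rescue_failure`, and `flaky_count` are hypothetical names introduced for illustration, the exception class stands in for `Mongo::ConnectionFailure`, and the sleep interval is shortened so the example finishes quickly.

```ruby
# Hypothetical stand-in for Mongo::ConnectionFailure, so the sketch runs
# without the mongo gem or a running replica set.
class FakeConnectionFailure < StandardError; end

# Same control flow as rescue_connection_failure in sample.rb. The exception
# class and the delay are parameters here (an assumption made for this sketch).
def rescue_failure(error_class, max_retries = 60, delay = 0.5)
  retries = 0
  begin
    yield
  rescue error_class => ex
    retries += 1
    raise ex if retries > max_retries
    sleep(delay)
    puts "Retrying"
    retry
  end
end

# Simulated operation: raises for the first state[:failures] calls,
# then returns a fixed record count.
def flaky_count(state)
  state[:calls] += 1
  raise FakeConnectionFailure, "primary down" if state[:calls] <= state[:failures]
  42
end

state = { calls: 0, failures: 3 }
result = rescue_failure(FakeConnectionFailure, 60, 0.01) { flaky_count(state) }
puts "There are #{result} records after #{state[:calls]} attempts."
# Prints "Retrying" three times, then:
# There are 42 records after 4 attempts.
```

In this sketch "Retrying" is printed once per rescued failure. Seeing it only once before the hang in the real script therefore suggests that the second attempt neither raised nor returned, so the block is probably stuck inside the driver call rather than in the retry loop itself.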