[SERVER-835] Down replica pair with persistent connection causes hang Created: 24/Mar/10  Updated: 26/Apr/10  Resolved: 26/Mar/10

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 1.2.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Mytton Assignee: Kristina Chodorow (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

PHP 5.2.13 Red Hat EL 5
Linux oxford.boxedice.net 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
Mongo PHP module 1.0.3 (Git e70b8e5)


Participants:

 Description   

1) Connect to a replica pair setup using a persistent connection from PHP

2) Take one of the pair instances down (in our case we took down the slave)

3) All httpd connections hang for a period of time. I assume this is caused by the timeout on the connection?

This seems to recur every few minutes. Although the later hangs are not as long as the first one, the downed member of the replica pair still causes connections to hang.



 Comments   
Comment by Eliot Horowitz (Inactive) [ 26/Mar/10 ]

See SERVER-832

Comment by Kristina Chodorow (Inactive) [ 26/Mar/10 ]

Taking down a slave should have 0 effect on the PHP driver, so moving this to SERVER.

Comment by David Mytton [ 25/Mar/10 ]

I didn't have a chance to run any commands from the shell during the lag.

Comment by Kristina Chodorow (Inactive) [ 25/Mar/10 ]

Makes sense! Have you happened to notice if commands at the shell have the same lag?

Comment by David Mytton [ 24/Mar/10 ]

We're using Mongo 1.2.4. I think there's something going on when I kill the slave, because all requests to httpd hang for 30 seconds or so and are then continually slow after that initial hang. We're serving ~15 req/s, but it's hard to replicate because it takes our app down, which I don't like doing.

Comment by Kristina Chodorow (Inactive) [ 24/Mar/10 ]

I'm having trouble reproducing this. Can you tell me what, if anything, is different in what you're doing?

I'm starting two servers, no arbiter:

$ ./mongod --pairwith localhost:27018 --dbpath ~/data1
$ ./mongod --port 27018 --pairwith localhost:27017 --dbpath ~/data2

I'm running the following PHP script and alternating killing the dbs:

<?php

$m = new Mongo("mongodb://localhost:27017,localhost:27018", array("persist" => "foo"));

$c = $m->foo->bar;

while (true) {
    echo "finding... ";
    try {
        $c->findOne();
    } catch (Exception $e) {
        echo $e->getMessage();
    }
    echo "\n";

    sleep(1);
}

?>

Taking down a slave shouldn't affect the client at all; it should just be using the master. What version of the DB are you using?

Generated at Thu Feb 08 02:55:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.