[SERVER-1929] handle replica set flapping Created: 12/Oct/10 Updated: 12/Jul/16 Resolved: 09/Oct/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 2.3.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Dwight Merriman | Assignee: | Kristina Chodorow (Inactive) |
| Resolution: | Done | Votes: | 2 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Description |
|
If TCP connections flap quickly, we should not fail over to a secondary. We should make sure this doesn't happen, at least in the trivial case where a single reconnect attempt succeeds right after a socket exception. Please research; maybe we can have a test for this too. We could add an option to the replSetTest command to close all connections; something like MessagingPort::closeAllSockets(0); might work for testing. |
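
The "trivial case" in the description can be illustrated with a minimal sketch. It is written in Python for brevity and is not the server's implementation: `peer_is_down`, `FlappingLink`, and the `retries` parameter are hypothetical names introduced here, and a socket exception is modelled as Python's `OSError`. The idea is only that a peer is reported as down when the first heartbeat and the immediate retry both fail.

```python
from typing import Callable


def peer_is_down(heartbeat: Callable[[], None], retries: int = 1) -> bool:
    """Return True only if the heartbeat fails on the first try and on every retry.

    `heartbeat` is a hypothetical callable that raises OSError on a socket error.
    """
    for _ in range(retries + 1):
        try:
            heartbeat()
            return False  # one successful attempt means the peer is reachable
        except OSError:
            continue  # socket exception: try to reconnect before giving up
    return True


class FlappingLink:
    """Test double: fails the first `failures` calls, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def __call__(self) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionResetError("socket closed")


if __name__ == "__main__":
    flap = FlappingLink(failures=1)
    print(peer_is_down(flap))  # a single flap should not count as down
    dead = FlappingLink(failures=5)
    print(peer_is_down(dead))  # repeated failures still count as down
```

A test double such as `FlappingLink` plays the role the description assigns to closing all sockets from a test command: it forces one socket error and lets the test check that no failover decision follows.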
| Comments |
| Comment by auto [ 12/Oct/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-10-12T10:54:33-07:00. Message: |
| Comment by auto [ 11/Oct/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-10-11T12:14:18-07:00. Message: |
| Comment by auto [ 11/Oct/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-10-11T09:04:28-07:00. Message: |
| Comment by auto [ 04/Oct/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-10-04T15:56:46-07:00. Message: |
| Comment by auto [ 04/Oct/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-10-04T12:18:29-07:00. Message: Fixed test because stepdown is so much faster that the connection is dead |
| Comment by auto [ 04/Oct/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-10-04T08:59:31-07:00. Message: |
| Comment by auto [ 21/Sep/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-09-21T09:24:03-07:00. Message: Add heartbeat timeout setting |
| Comment by auto [ 14/Sep/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-09-14T11:48:26-07:00. Message: Track heartbeats received for health |
| Comment by auto [ 05/Sep/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-09-05T13:58:34-07:00. Message: Check pointer before dereferencing |
| Comment by auto [ 05/Sep/12 ] |
|
Author: Kristina <kristina@10gen.com>, 2012-09-05T12:18:18-07:00. Message: Allow socket timeout to be set after connecting |
| Comment by Richard Kreuter (Inactive) [ 16/Jul/12 ] |
|
Kristina tells me this is the issue she considers canonical for changing stuff about RS heartbeats and things. Making RS failovers not happen unless necessary ought to be a major issue. |