[SERVER-27084] Retry reconnection after stepdown in multi_rs.js Created: 17/Nov/16 Updated: 06/Dec/22 Resolved: 05/Jan/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | Backlog - Replication Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Replication |
| Backport Requested: | v3.4, v3.2 |
| Participants: | |
| Linked BF Score: | 19 |
| Description |
|
On 3.2, a connection that has just been reconnected can be closed again. The test needs to retry reconnect(). |
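A minimal sketch of the retry idea described above, not the change proposed for multi_rs.js. The helper name `retryReconnect`, its parameters, and the `onRetry` hook are all hypothetical; the `reconnectFn` callback stands in for whatever reconnect call the test actually makes.

```javascript
// Hypothetical helper: call reconnectFn until it succeeds or the attempts run out.
// reconnectFn is assumed to throw when the new connection is closed underneath it.
function retryReconnect(reconnectFn, maxAttempts, onRetry) {
    if (!(maxAttempts >= 1)) {
        throw new Error("maxAttempts must be at least 1");
    }
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            // Return whatever the successful reconnect attempt returns.
            return reconnectFn();
        } catch (e) {
            lastError = e;
            // Optional hook, e.g. to log the failure or pause before the next attempt.
            if (onRetry) {
                onRetry(attempt, e);
            }
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw lastError;
}
```

A test could wrap its existing reconnect call in a closure and pass it as `reconnectFn`, with a small attempt limit so that a genuinely unreachable node still fails the test.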
| Comments |
| Comment by Spencer Jackson [ 04/Jan/17 ] |
|
I'm returning this ticket to the replication backlog because it is currently unclear whether this is the best fix for the problem. |
| Comment by Siyuan Zhou [ 17/Nov/16 ] |
|
The problem is that the reconnected connection is closed during primary stepdown. We have only seen this on 3.2. In both 3.2 and 3.4, closing and accepting connections are protected by a mutex, so the reconnection should be accepted only after all connections have been closed, and should therefore be safe. It's not clear why this happens, and the network layer has changed a lot since 3.2. |
| Comment by Spencer Brody (Inactive) [ 17/Nov/16 ] |
|
Does this affect only 3.2, or also 3.4 and master? Since it's not an internal connection, I'd expect the behavior to be the same in all versions. |