[SERVER-27084] Retry reconnection after stepdown in multi_rs.js Created: 17/Nov/16  Updated: 06/Dec/22  Resolved: 05/Jan/17

Status: Closed
Project: Core Server
Component/s: Replication, Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Siyuan Zhou
Assignee: Backlog - Replication Team
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-25831 Wait for secondary state before retur... Closed
Assigned Teams:
Replication
Backport Requested:
v3.4, v3.2
Participants:
Linked BF Score: 19

 Description   

In 3.2, a connection that has just been reconnected can be closed again. We need to retry reconnect() when that happens.
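
A minimal sketch of the retry idea, not the actual change to multi_rs.js. The names retryReconnect and makeFlakyReconnect are hypothetical, and the reconnect step is passed in as a callback so that no particular shell helper signature is assumed. The flaky stand-in simulates a connection that is closed once right after it is re-established.

```javascript
// Hypothetical helper: calls reconnectFn until it succeeds or the attempt
// budget runs out. Returns the number of attempts that were needed.
function retryReconnect(reconnectFn, maxAttempts) {
    let lastError = null;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            reconnectFn();
            return attempt;
        } catch (e) {
            // Assumption: any exception means the new connection was closed.
            lastError = e;
        }
    }
    const reason = lastError ? lastError.message : "no attempts made";
    throw new Error("reconnect failed after " + maxAttempts + " attempts: " + reason);
}

// Stand-in for the real reconnect step: fails `failures` times, then succeeds.
function makeFlakyReconnect(failures) {
    let remaining = failures;
    return function() {
        if (remaining > 0) {
            remaining--;
            throw new Error("connection closed");
        }
    };
}

// One spurious close after the stepdown, then a successful reconnect.
const attemptsUsed = retryReconnect(makeFlakyReconnect(1), 3);
```

In a real test the callback would wrap the shell's reconnect() call on the affected connection, and the attempt budget would be chosen to cover the stepdown window.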



 Comments   
Comment by Spencer Jackson [ 04/Jan/17 ]

I'm returning this ticket to the replication backlog, as there is currently some uncertainty about whether this is the best fix for the problem.

Comment by Siyuan Zhou [ 17/Nov/16 ]

The problem is that the reconnection is closed during primary stepdown. We have only seen this on 3.2. In both 3.2 and 3.4, closing and accepting connections are protected by a mutex, so the reconnection should be accepted only after all connections have been closed, and should therefore be safe. It's not clear why this happens, and the network layer has changed a lot since 3.2.

Comment by Spencer Brody (Inactive) [ 17/Nov/16 ]

Does this only affect 3.2 or also 3.4 and master? Since it's not an internal connection I'd expect the behavior to be the same in all versions.

Generated at Thu Feb 08 04:14:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.