[SERVER-57521] FCV change unconditionally closes outgoing connections that have not finished their hello handshake Created: 08/Jun/21  Updated: 29/Oct/23  Resolved: 29/Jun/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.2, 5.1.0-rc0

Type: Task Priority: Major - P3
Reporter: Jordi Serra Torrens Assignee: Janna Golden
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File setfcv_conndrop_repro.js    
Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-57202 _flushReshardingStateChange can fail ... Closed
Related
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0
Sprint: Sharding 2021-06-28
Participants:
Linked BF Score: 50

 Description   

Upon changing the FCV value, outgoing connections to servers with a lower binary version are closed. This is done by closing all the connections except those tagged with transport::Session::kKeepOpen.
However, there exists a race condition between setting the kKeepOpen flag during the hello handshake, and deciding whether to close the connection by looking at the tags.
As a result, connections that are in the process of being established can be incorrectly dropped during setFCV.

ConnectionPool::dropConnections is not considering the kPending flag.



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 21/Jul/21 ]

Author:

{'name': 'jannaerin', 'email': 'golden.janna@gmail.com', 'username': 'jannaerin'}

Message: SERVER-57521 Avoid dropping connections with pending flag set, unset pending flag if binary versions incompatible

(cherry picked from commit 39505432404a35ebd35e05dffec0ad29d7f79262)
Branch: v5.0
https://github.com/mongodb/mongo/commit/803c423fd5d055b4e3c82e645273800998dc2362

Comment by Githook User [ 29/Jun/21 ]

Author:

{'name': 'jannaerin', 'email': 'golden.janna@gmail.com', 'username': 'jannaerin'}

Message: SERVER-57521 Avoid dropping connections with pending flag set, unset pending flag if binary versions incompatible
Branch: master
https://github.com/mongodb/mongo/commit/39505432404a35ebd35e05dffec0ad29d7f79262

Comment by Jordi Serra Torrens [ 11/Jun/21 ]

AsyncDBClient::_parseIsMasterResponse does check the wire version prior to setting the kKeepOpen flag, but this can race. In fact, I think a race condition already exists where:
1. Let's say the local server has minWireVersion 9
2. [th1] A connection is being established to a remote server with wire version 9. We have passed the wire version check here.
3. [th2] setFCV runs and advances minWireVersion to 10
4. [th1] the kKeepOpen flag will be set for the connection, although the wire version is no longer compatible
5. [th2] the connection won't be closed because it has been tagged with kKeepOpen

So either this has to be safe, or I think we already have a bug there. Someone from repl or service arch may be more familiar with this code to confirm this.

Comment by Matthew Saltz (Inactive) [ 10/Jun/21 ]

Is it a problem if we don’t drop a pending connection and then it turns out to be a connection that’s not marked “kKeepOpen” and is also to an older binary version?

Comment by Janna Golden [ 10/Jun/21 ]

I have a repro of this issue that I've attached to this ticket. Adding a check for `kPending` in ConnectionPool::dropConnections fixes the issue. I can take this ticket and put up a CR if we agree that adding a check for `kPending` in `ConnectionPool:dropConnections` is the right fix.

Generated at Thu Feb 08 05:42:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.