[SERVER-63045] Drop pooled connections to nodes no longer in sharded cluster after a topology change Created: 27/Jan/22  Updated: 12/Dec/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: George Wangensteen Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: sharding-nyc-subteam2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Cluster Scalability
Participants:
Story Points: 4

 Description   

After a replica set reconfig, drop pooled connections to the removed node.

When a server leaves a sharded cluster, drop pooled connections to and from the removed server. 

Right now, if a server is removed, its hostname resolution is changed, and the server is added back with a new IP, pooled connections to that server that resolved the hostname before the removal will be broken. The cluster will encounter errors when attempting to use those connections.

Worse, if a different MongoDB server now runs at the original IP, connections that were pooled before the removal will appear to point at one MongoDB server while actually pointing at another. For example, consider serverA with hostnameA. Originally, hostnameA resolves to IP_A. Then serverA is removed, and hostname resolution is changed so that hostnameA points to IP_B. Then serverA is re-added to the cluster, and serverB is started at IP_A. Connections that were pooled to serverA before it was removed have already resolved hostnameA to IP_A, which is stale once hostname resolution changes. They will actually be talking to serverB.
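The serverA/serverB scenario can be reproduced with a small simulation. This is only an illustrative sketch, not server code: `Pool`, `PooledConnection`, and the `dns` and `servers` dictionaries are hypothetical stand-ins for the connection pool, the resolver, and the machines listening at each IP.

```python
from dataclasses import dataclass


@dataclass
class PooledConnection:
    hostname: str
    resolved_ip: str  # fixed when the connection is first established


class Pool:
    """Hypothetical pool keyed by hostname; it never re-resolves."""

    def __init__(self, dns):
        self.dns = dns  # stand-in resolver: hostname -> IP
        self.conns = {}

    def get(self, hostname):
        # Resolve only when no pooled connection exists for this hostname.
        if hostname not in self.conns:
            self.conns[hostname] = PooledConnection(hostname, self.dns[hostname])
        return self.conns[hostname]


# Initially hostnameA resolves to IP_A, where serverA is running.
dns = {"hostnameA": "IP_A"}
servers = {"IP_A": "serverA"}
pool = Pool(dns)
pool.get("hostnameA")  # connection is pooled with IP_A baked in

# serverA is removed, hostnameA is repointed to IP_B, serverA is re-added
# there, and serverB is started at the old address IP_A.
dns["hostnameA"] = "IP_B"
servers = {"IP_B": "serverA", "IP_A": "serverB"}

# The pool still hands back the connection resolved before the removal.
stale = pool.get("hostnameA")
actual_peer = servers[stale.resolved_ip]
```

In this model the pool believes it is talking to serverA via hostnameA, but `actual_peer` is the server now running at IP_A.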

Currently, we need to either wait for connection pools to age out or manually issue a dropConnections command on all cluster members to avoid this issue. If a user doesn't do this, they will encounter the above issues. We should automatically drop connections to and from removed nodes when we notice a topology change. 
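A minimal sketch of the proposed behavior, under stated assumptions: `removed_hosts`, `drop_connections_command`, and `on_topology_change` are hypothetical helpers, and the pool is modeled as a plain dictionary from "host:port" to a list of connections. The command document follows the shape of the existing dropConnections admin command (a `hostAndPort` array); how the server would actually detect the topology change and drop connections is not specified in this ticket.

```python
def removed_hosts(old_members, new_members):
    """Return the hosts present before the topology change but not after."""
    return set(old_members) - set(new_members)


def drop_connections_command(hosts):
    """Build the document for the dropConnections admin command.

    Today an operator would run this by hand against every cluster member,
    e.g. with a driver's admin-command helper.
    """
    return {"dropConnections": 1, "hostAndPort": sorted(hosts)}


def on_topology_change(pool, old_members, new_members):
    """Hypothetical hook: drop pooled connections to hosts that left."""
    gone = removed_hosts(old_members, new_members)
    for host in gone:
        pool.pop(host, None)  # discard every pooled connection to that host
    return gone


# Illustrative host names only.
pool = {
    "hostnameA:27018": ["conn1", "conn2"],
    "hostnameC:27018": ["conn3"],
}
old = ["hostnameA:27018", "hostnameC:27018"]
new = ["hostnameC:27018"]

gone = on_topology_change(pool, old, new)
cmd = drop_connections_command(gone)
```

With automatic dropping, a later re-add of hostnameA would force a fresh connection and a fresh hostname resolution instead of reusing a stale one.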

 

Note that SERVER-36417 completed this work for replica sets; i.e. nodes in a replica set that are removed from the config will have connections dropped as part of the reconfig process. This ticket tracks completing the work for sharded clusters and connections outside replica sets (i.e. shard to shard and shard to mongos).

 



 Comments   
Comment by Ratika Gandhi [ 28/Jan/22 ]

garaudy.etienne would like to know the level of effort for this ticket lamont.nelson.

Generated at Thu Feb 08 05:56:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.