[SERVER-54454] Use same connection pool for heartbeats and oplog fetching Created: 10/Feb/21  Updated: 10/Apr/23

Status: Backlog
Project: Core Server
Component/s: Networking, Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Judah Schvimer Assignee: Backlog - Replication Team
Resolution: Unresolved Votes: 2
Labels: former-quick-wins
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Replication
Sprint: Replication 2021-11-15
Participants:
Case:

 Description   

Using separate connection pools leaves us open to a case where one pool exhausts its connections while the other does not. It also probably creates more connections than we need.
The two pools in question are the one used for oplog fetching and the one used for heartbeats.
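
A toy model of the failure mode described above may help. This is only a sketch: `ToyPool`, `PoolExhausted`, and the helper functions are hypothetical names made up for illustration and do not correspond to the server's actual connection pool classes or limits.

```python
class PoolExhausted(Exception):
    """Raised when a toy pool has no free connection slots."""


class ToyPool:
    """Hypothetical fixed-size pool; not the server's real connection pool."""

    def __init__(self, name, max_connections):
        self.name = name
        self.max_connections = max_connections
        self.in_use = 0

    def acquire(self):
        # Refuse to hand out a connection once the limit is reached.
        if self.in_use >= self.max_connections:
            raise PoolExhausted(self.name)
        self.in_use += 1

    def release(self):
        if self.in_use > 0:
            self.in_use -= 1


def can_acquire(pool):
    """Return True if the pool hands out a connection (released right away)."""
    try:
        pool.acquire()
    except PoolExhausted:
        return False
    pool.release()
    return True


def separate_pools_disagree(limit=2):
    """Exhaust only the oplog pool; the heartbeat pool is unaffected."""
    heartbeat_pool = ToyPool("heartbeat", limit)
    oplog_pool = ToyPool("oplog", limit)
    for _ in range(limit):
        oplog_pool.acquire()
    return can_acquire(heartbeat_pool), can_acquire(oplog_pool)


def shared_pool_agrees(limit=2):
    """With one shared pool, both users see the same exhaustion."""
    shared = ToyPool("shared", limit)
    for _ in range(limit):
        shared.acquire()
    return can_acquire(shared), can_acquire(shared)
```

In this model `separate_pools_disagree()` reports that heartbeats can still get a connection while oplog fetching cannot, whereas `shared_pool_agrees()` reports the same answer for both users.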



 Comments   
Comment by Andy Schwerin [ 01/Jun/21 ]

I think there are two routes to achieving this goal. (1) The ticket's suggested solution: use the same connection pool for heartbeats and oplog fetching. Since we're primarily concerned with heartbeats failing when oplog fetching has begun to fail, we could either (1a) use the same pool, or (1b) have the DBClient used for oplog fetching remove a working connection from the pool used by heartbeating. (2) When oplog fetching fails to establish a connection to the upstream node, it could flush all connections out of the connection pool used for heartbeats, so that the next heartbeat is required to establish a new connection.
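
Route (2) could be sketched roughly as follows. This is a toy model, not server code: `ToyHeartbeatPool`, `fetch_oplog`, `heartbeat`, and `demo` are hypothetical names, and the model assumes the problem scenario is one where an already-pooled heartbeat connection keeps working while new connections to the upstream node can no longer be established.

```python
class ToyHeartbeatPool:
    """Hypothetical pool that caches idle heartbeat connections."""

    def __init__(self):
        self.idle = []

    def checkout(self, connect):
        # Reuse a cached connection if one exists, else establish a new one.
        if self.idle:
            return self.idle.pop()
        return connect()

    def checkin(self, conn):
        self.idle.append(conn)

    def flush(self):
        # Drop every cached connection so the next user must reconnect.
        self.idle.clear()


def fetch_oplog(connect, heartbeat_pool):
    """On a failed connection attempt, flush the heartbeat pool (route 2)."""
    try:
        return connect()
    except ConnectionError:
        heartbeat_pool.flush()
        raise


def heartbeat(connect, pool):
    """Return True if a heartbeat could obtain a connection."""
    try:
        conn = pool.checkout(connect)
    except ConnectionError:
        return False
    pool.checkin(conn)
    return True


def demo():
    state = {"up": True}

    def connect():
        # Assumption: new connections fail once the upstream is unreachable.
        if not state["up"]:
            raise ConnectionError("upstream unreachable")
        return object()

    pool = ToyHeartbeatPool()
    before = heartbeat(connect, pool)   # establishes and caches a connection
    state["up"] = False
    stale = heartbeat(connect, pool)    # still succeeds on the cached connection
    try:
        fetch_oplog(connect, pool)      # fails and flushes the heartbeat pool
    except ConnectionError:
        pass
    after = heartbeat(connect, pool)    # must reconnect, so it now fails
    return before, stale, after
```

In this model the heartbeat keeps succeeding on its cached connection until the failed oplog fetch flushes the pool, after which the next heartbeat has to reconnect and fails as well.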

Comment by Judah Schvimer [ 10/Feb/21 ]

dmitry.agranat would like to backport this to 4.0, which we should consider after implementation. If the backport is too difficult, or if we are not going to do this ticket soon, we should update the arch guide with a warning that there are two connection pools.

Generated at Thu Feb 08 05:33:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.