[COMPASS-6103] Investigate changes in PM-2765: Refactor and Improve ConnectionPool
Created: 06/Sep/22
Updated: 19/Sep/22
Resolved: 19/Sep/22

Status: Closed
Project: Compass
Component/s: None
Affects Version/s: None
Fix Version/s: No version

Type: Investigation
Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team
Assignee: Unassigned
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Documentation Changes: Not Needed

Description
Original Downstream Change Summary

Possible changes to FTDC/serverStatus.

Description of Linked Ticket

Epic Summary

Summary

We should refactor and reorganize the ConnectionPool implementation to remove unnecessary abstractions, which would make the code easier to understand and maintain. Allocating a dedicated executor thread for each ConnectionPool could also make its behavior more predictable (e.g., by making much of the current synchronization unnecessary). We may also evaluate the feasibility and benefits of replacing the current design, which uses many connection pools, with a single pool capable of providing similar guarantees for egress connections. Finally, we should add more diagnostics to the pool (e.g., running averages for the duration of creating new connections, refreshing existing connections, and returning checked-out connections) and create a section in FTDC that reports these metrics for all existing instances of ConnectionPool. These improvements would help with investigating incidents similar to HELP-27338 and are aligned with making the codebase more maintainable and easier to debug.
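
As a rough illustration of the proposed diagnostics, the sketch below keeps a running average of operation durations per pool and assembles one report section covering every pool. It is written in Python for brevity and is not the server's implementation. The names RunningAverage, PoolDiagnostics, report_all, and the count/meanMillis field names are all hypothetical, and a cumulative arithmetic mean is assumed here where the ticket only says "running averages".

```python
from dataclasses import dataclass, field


@dataclass
class RunningAverage:
    """Cumulative arithmetic mean of all samples recorded so far."""
    count: int = 0
    mean_ms: float = 0.0

    def record(self, duration_ms: float) -> None:
        self.count += 1
        # Incremental update, so individual samples need not be stored.
        self.mean_ms += (duration_ms - self.mean_ms) / self.count


@dataclass
class PoolDiagnostics:
    """Hypothetical per-pool metrics; all names are illustrative only."""
    create: RunningAverage = field(default_factory=RunningAverage)
    refresh: RunningAverage = field(default_factory=RunningAverage)
    return_: RunningAverage = field(default_factory=RunningAverage)

    def report(self) -> dict:
        # One entry per tracked operation: creating, refreshing, returning.
        return {
            name: {"count": avg.count, "meanMillis": avg.mean_ms}
            for name, avg in (
                ("create", self.create),
                ("refresh", self.refresh),
                ("return", self.return_),
            )
        }


def report_all(pools: dict) -> dict:
    """Build a single section covering every pool, keyed by pool name."""
    return {name: diag.report() for name, diag in pools.items()}
```

A caller would invoke record() with the measured duration each time a connection is created, refreshed, or returned, and call report_all() whenever the diagnostics section is collected.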

Motivation

Managing egress connections (e.g., creating new connections and refreshing existing ones) is implemented through several layers of abstraction, such as ConnectionPool, SpecificPool, TLConnection, and TLTypeFactory. These types interact internally to maintain a set of connection pools and rely on an external executor (i.e., the networking reactor thread) for housekeeping. With respect to diagnostics, each pool reports only the aggregate number of created, available, refreshing, refreshed, and in-use connections. Furthermore, these metrics are recorded only on Mongos, and only for the connection pools owned by the ShardingTaskExecutor.

Documentation

Product Description
Scope Document
Technical Design Document



Comments
Comment by Anna Henningsen [ 19/Sep/22 ]

No impact
