Problem statement
When a mongos is initially brought up, it lacks connections to any of the cluster’s shard servers. This can add latency to user operations, because a connection pool only starts spooling up connections to a host once a request is made to that host (even in the presence of minPoolSize).
Proposal
Provide a new startup parameter that causes mongos to wait a configurable amount of time to establish at least minPoolSize connections to each shard server before accepting user connections. This would allow us to restart mongos instances, or add new ones, without making any initial user requests pay the overhead of waiting on connection establishment.
This should be relatively simple. Cribbing from the test-only mongos command multicast:
- Temporarily turn the host timeout to infinity
- Fetch all shard ids
- For each shard id
  - Get the connection string
  - For each host in the connection string
    - Push the host onto a list
- Use the AsyncMulticaster, with the arbitrary executor, to multicast a ping to all hosts
- Wait for a configurable delay, periodically checking connpoolstats to see if hosts all have minPoolSize connections
  - If all pools reach the desired size before the timeout, stop waiting
  - If not all pools do, continue once the timeout expires
- Return the host timeout to its former value
- Start accepting client connections
The two steps that change and then restore the host timeout may also be optional (if we’re waiting 10 seconds, the 5 minute host timeout may not matter).
In either case, the work should be relatively approachable, as it only involves being a regular client of the networking layer in mongos, rather than doing any development against the interior components.
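To make the control flow concrete, here is a minimal, language-neutral sketch of the steps above, written in Python rather than against the real mongos code. Every name in it (collect_hosts, wait_for_warm_pools, prewarm, send_ping, pool_size) is hypothetical: send_ping stands in for the multicast ping and pool_size stands in for whatever connpoolstats reports per host. The host timeout steps are left out because they depend on mongos internals.

```python
import time
from typing import Callable, Dict, List


def collect_hosts(shard_conn_strings: Dict[str, str]) -> List[str]:
    """Flatten 'rsName/host1:port,host2:port' connection strings into one host list."""
    hosts: List[str] = []
    for conn_string in shard_conn_strings.values():
        # Drop the optional replica-set name that precedes the '/'.
        _, _, host_part = conn_string.rpartition("/")
        hosts.extend(h for h in host_part.split(",") if h)
    return hosts


def wait_for_warm_pools(
    hosts: List[str],
    pool_size: Callable[[str], int],
    min_pool_size: int,
    timeout_secs: float,
    poll_interval_secs: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until every host has min_pool_size connections or the timeout expires.

    Returns True if all pools warmed up in time, False if we gave up waiting.
    """
    deadline = clock() + timeout_secs
    while True:
        if all(pool_size(h) >= min_pool_size for h in hosts):
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(poll_interval_secs, remaining))


def prewarm(
    shard_conn_strings: Dict[str, str],
    send_ping: Callable[[str], None],
    pool_size: Callable[[str], int],
    min_pool_size: int,
    timeout_secs: float,
) -> bool:
    """Ping every shard host, then wait (bounded) for the pools to fill."""
    hosts = collect_hosts(shard_conn_strings)
    for host in hosts:
        send_ping(host)  # hypothetical stand-in for the multicast ping
    return wait_for_warm_pools(hosts, pool_size, min_pool_size, timeout_secs)
```

Whatever prewarm returns, the caller would go on to accept client connections; the return value only says whether the pools were fully warmed, which could be useful for logging.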
- causes: SERVER-60344 Action plan on lagging setFCV replicas breaking tests (Closed)
- related to: SERVER-47169 Sharding initialization contacts config shard before ShardRegistry updated by RSM, preventing mongos from starting up (Closed)
- related to: SERVER-59941 Refactor pre-warm connection pools in mongos to use process_health machinery (Backlog)