[GODRIVER-1598] Connection pool events should return the number of active connections Created: 29/Apr/20 Updated: 25/Aug/23 Resolved: 09/Jun/20 |
|
| Status: | Closed |
| Project: | Go Driver |
| Component/s: | Connections |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | AJ Ortega | Assignee: | Divjot Arora (Inactive) |
| Resolution: | Declined | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Connection pool events should return the number of connections. Right now there's no way for clients to get visibility into the number of active connections in the go driver. |
| Comments |
| Comment by Divjot Arora (Inactive) [ 09/Jun/20 ] | ||||||||||||||||||||||||||||
|
I'm moving this to "Closed" because we haven't heard anything and the issue of ConnectionClosed events not being published will be addressed in | ||||||||||||||||||||||||||||
| Comment by Divjot Arora (Inactive) [ 08/May/20 ] | ||||||||||||||||||||||||||||
|
I have tried to do this on my own. I noticed that there was an issue with ConnectionClosed events being missed when the pool is being disconnected and have filed I didn't see any issues with ConnectionCreated events though. One thing I noted from your logs is the ConnectionCreatedEvent reports a connection count of 18446744073709551612. The first ConnectionClosed event shows count 18446744073709551611, which makes sense. The next four ConnectionClosed events actually show an increasing count (18446744073709551612 - 18446744073709551615). That suggests that the atomic is being incremented, which would only happen in your monitor if a ConnectionCreated event was published. Any idea what's going on there? Also, those numbers are very high. Are that many connections actually being created or is this an overflow/underflow issue? | ||||||||||||||||||||||||||||
| Comment by AJ Ortega [ 07/May/20 ] | ||||||||||||||||||||||||||||
|
Have you tried to make your own connection counter and verify that it works? I also audited the code, but my connection counts are clearly incorrect. | ||||||||||||||||||||||||||||
| Comment by Divjot Arora (Inactive) [ 07/May/20 ] | ||||||||||||||||||||||||||||
|
ajo@stripe.com I haven't had any luck figuring out why ConnectionCreated events wouldn't be firing. Connections are either created in a background pooling routine if MinPoolSize is specified or during an application operation if the pool is empty. In both cases, the code goes through topology/pool.makeNewConnection (https://github.com/mongodb/mongo-go-driver/blob/master/x/mongo/driver/topology/pool.go#L270), which fires a ConnectionCreatedEvent as long as newConnection succeeds. Note that newConnection only creates an in-memory type and does not actually do the network I/O needed to handshake/authenticate the connection, so it would only fail if the connection were mis-configured, which is very unlikely. If that were the case, no connections would be created successfully because they all use the same configuration. Do you have any more information that might be helpful in debugging this issue? | ||||||||||||||||||||||||||||
| Comment by Divjot Arora (Inactive) [ 06/May/20 ] | ||||||||||||||||||||||||||||
|
Thanks for the extra information ajo@stripe.com. If ConnectionCreated events are not being published, that seems like a bug. There should be one ConnectionCreated event per ConnectionClosed event. I'll investigate this a bit more tomorrow. Can you provide the driver version you're using? | ||||||||||||||||||||||||||||
| Comment by AJ Ortega [ 05/May/20 ] | ||||||||||||||||||||||||||||
|
A quick update: we attempted to use the connection pool events as described, but there seem to be missing ConnectionCreated events, preventing an accurate count. Our monitor looks like the following:
We're using the topology interface directly as follows:
Specifically, this falls over in the face of idle connections:
The code for managing this counter is further complicated by the fact that multiple threads could be emitting these events in parallel, necessitating the need for an atomic counter. The len() value that the library has access to in its pointer receivers would be able to provide more accurate information without the need for an additional atomic operation.
| ||||||||||||||||||||||||||||
| Comment by Divjot Arora (Inactive) [ 30/Apr/20 ] | ||||||||||||||||||||||||||||
|
Ok, I'd like to summarize everything I've gathered from the comments to make sure we're on the same page. For simplicity, all of these comments assume that there's a single mongod server running as a standalone.
| ||||||||||||||||||||||||||||
| Comment by AJ Ortega [ 30/Apr/20 ] | ||||||||||||||||||||||||||||
|
Even that map won't help, because we may have different copies of the driver talking to the same node, which will each have their own connection pool. That's why I wanted to get the information from the source of truth rather than trying to reconstruct it out-of-band. | ||||||||||||||||||||||||||||
| Comment by Divjot Arora (Inactive) [ 30/Apr/20 ] | ||||||||||||||||||||||||||||
|
It seems like the ConnnectionID field is a per-pool incrementing integer that gets assigned to a connection in pool.makeNewConnection. However, this probably won't help for your use case because it doesn't seem like connection IDs are re-used after a connection is closed, likely for performance reasons. I'll talk over a solution with the rest of the team and post here once we have some more ideas. From what I can tell, it doesn't seem like there's an easy way to get the number of connections per-node, except for keeping a map from server address to connection count in the pool monitor. – Divjot | ||||||||||||||||||||||||||||
| Comment by AJ Ortega [ 30/Apr/20 ] | ||||||||||||||||||||||||||||
|
This should be marked as an improvement, not a bug. We are interested in the number of connections per node, which will allow us to find clients that are creating too many connections. Since there's a semaphore per-node, I didn't want to introduce any further contention by attempting to figure out how many connections the other nodes have. It's possible to figure out the number of the connections in the driver overall after the fact by looking through logs and performing analysis, or by aggregating data in a data structure provided to the pool event monitor.
Our use case involves many replsets, so a solution around keeping a counter would require us to pass state to the monitor function as a captured variable, and is much more complex to maintain that a quick call to len() while the lock is already held.
I agree that it could be confusing to users not familiar with the internal workings of the driver. Do you think a different field name would be clearer? The driver already reports the per-backend connection ID, which I haven't been able to find a use case for in our client. | ||||||||||||||||||||||||||||
| Comment by Divjot Arora (Inactive) [ 30/Apr/20 ] | ||||||||||||||||||||||||||||
|
ajo@stripe.com Do you have a specific reason for filing this as a bug? Adding this information would require adding a new field to the PoolEvent struct, which seems more like an improvement than a bug. I looked over the PR you sent for this, and I'm not sure it's correct because it reports the size of the pool that emits the event. However, there is one connection pool per node in the cluster (i.e. one per topology.Server instance), so that's not going to accurately report the number of connections in the driver. That number will also change depending on which pool is emitting the event, as they all share the same monitor. As a workaround for the time being, you can keep a counter which gets incremented on ConnectionOpened events and decremented on ConnectionClosed events to track how many connections have been opened by the pool. |