[GODRIVER-1589] Consolidate pool.drain() and pool.clear() Created: 22/Apr/20  Updated: 28/Oct/23  Resolved: 12/May/20

Status: Closed
Project: Go Driver
Component/s: Monitoring
Affects Version/s: 1.3.1
Fix Version/s: 1.3.4

Type: Bug Priority: Major - P3
Reporter: Maxime Jimenez Assignee: Isabella Siu (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

We have separate functions for pool.drain() and pool.clear(), and pool.drain() does not emit a PoolCleared event. These should be consolidated into a single function that emits the event.
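To illustrate the idea behind the consolidation, here is a minimal, hypothetical sketch (all names invented; the driver's actual pool implementation differs): a single Clear entry point that invalidates outstanding connections and always publishes the event, so no code path can skip it.

```go
package main

import "sync"

// PoolClearedEvent stands in for the kind of event the driver publishes;
// the field names here are invented for illustration.
type PoolClearedEvent struct{ Address string }

// Pool is a toy connection pool using a generation counter: bumping the
// generation marks all previously checked-out connections as stale.
type Pool struct {
	mu         sync.Mutex
	address    string
	generation uint64
	onCleared  func(PoolClearedEvent)
}

// Clear bumps the pool's generation, invalidating all outstanding
// connections, and always emits a PoolCleared-style event. Having one
// entry point (rather than separate drain/clear paths) guarantees the
// event cannot be skipped.
func (p *Pool) Clear() {
	p.mu.Lock()
	p.generation++
	p.mu.Unlock()
	if p.onCleared != nil {
		p.onCleared(PoolClearedEvent{Address: p.address})
	}
}
```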



 Comments   
Comment by Divjot Arora (Inactive) [ 12/May/20 ]

Hi maxime.jimenez@kapten.com,

We've made the code changes for the bug I described in a previous comment. When this is released, you should be able to modify your PoolMonitor to look for PoolCleared events, which indicate that there was a non-transient error that caused the driver to clear out the connection pool for a server.

As far as the long delay for ConnectionClosed events, expired connections are closed by a goroutine which iterates the connection pool every minute to check which connections are invalid. If the routine has just run, you could wait up to a minute for the next sweep. As I mentioned before, I don't think ConnectionClosed events are relevant to your use case. If you need to programmatically detect a server shutdown, PoolCleared events are probably the best way to go. However, note that depending on your version of MongoDB, this could provide false positives because older server versions close connections when the primary becomes a secondary. Overall, though, the driver has internal mechanisms to detect and recover from these scenarios, so you might not need any manual detection as I mentioned earlier.

– Divjot

Comment by Githook User [ 11/May/20 ]

Author:

{'name': 'iwysiu', 'email': 'isabella.siu@10gen.com', 'username': 'iwysiu'}

Message: GODRIVER-1589 fix data race in test (#404)
Branch: release/1.3
https://github.com/mongodb/mongo-go-driver/commit/a6517472f815a0520562711756c7f726640fd755

Comment by Githook User [ 11/May/20 ]

Author:

{'name': 'iwysiu', 'email': 'isabella.siu@10gen.com', 'username': 'iwysiu'}

Message: GODRIVER-1589 consolidate pool.drain and pool.clear (#397)
Branch: release/1.3
https://github.com/mongodb/mongo-go-driver/commit/6e4f4c5a2d563abb6871eb93cb79edf4ccaab8b4

Comment by Githook User [ 11/May/20 ]

Author:

{'name': 'iwysiu', 'email': 'isabella.siu@10gen.com', 'username': 'iwysiu'}

Message: GODRIVER-1589 fix data race in test (#404)
Branch: master
https://github.com/mongodb/mongo-go-driver/commit/f1f16a1f4d769d844812278841a184ae7f301732

Comment by Githook User [ 08/May/20 ]

Author:

{'name': 'iwysiu', 'email': 'isabella.siu@10gen.com', 'username': 'iwysiu'}

Message: GODRIVER-1589 consolidate pool.drain and pool.clear (#397)
Branch: master
https://github.com/mongodb/mongo-go-driver/commit/a7e237231f5c8de52f0af3036facca0e66cf2fc4

Comment by Divjot Arora (Inactive) [ 23/Apr/20 ]

Thanks for the feedback. I think we have enough information to triage this issue. I'm hoping that we can use this ticket to fix the bug so that PoolCleared events are correctly published and also to update the documentation for PoolMonitor and provide guidance on when specific events are published. Our triage meetings are on Mondays, so I will discuss this ticket with the team at that time.

– Divjot

Comment by Maxime Jimenez [ 23/Apr/20 ]

Thank you for the answer. I'll check on my side whether this works and whether the server comes back up at the same address.

About point 2: yes, I don't understand why we don't receive any events until after a "long" delay when not performing requests (for instance, a web server that receives no requests triggers no MongoDB queries), while when we are performing queries the event shows up immediately. I don't have much knowledge of how this is handled internally on the driver side, though.

Apart from this, the driver documentation mentions "background checks" and "background goroutines", but it is hard to see the big picture: is HeartbeatInterval linked to this "background goroutine" and to PoolMonitor calls? It might be worth making explicit what is linked and what is not.

Maxime

Comment by Divjot Arora (Inactive) [ 23/Apr/20 ]

Two things:

  1. As long as the server comes back up at the same address, there's no action needed on your end to be resilient to restarts. The driver will re-discover the server once it's back up. In the meantime, requests will not be routed to that server.
  2. If you do want to check for server disconnection, you can probably look for PoolCleared events instead of ConnectionClosed. The background monitoring routine should clear the connection pool for the server if there is a connection error during monitoring, which would occur if the server has shut down. However, I think there may be a bug in this code path that prevents the event from being published, so we'll triage that bug in our meeting next Monday. I'm hoping to fix that bug under this ticket.

– Divjot
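The suggestion in point 2 can be sketched as a listener keyed on the PoolCleared event type. The snippet below mirrors the shape of the driver's event package locally (a PoolEvent with a string Type field and a PoolCleared constant) so that it stands alone; in real code you would use the types from go.mongodb.org/mongo-driver/event instead:

```go
package main

// PoolEvent mirrors the shape of event.PoolEvent from
// go.mongodb.org/mongo-driver/event; only the fields used here are included.
type PoolEvent struct {
	Type    string
	Address string
}

// PoolCleared mirrors the driver's event.PoolCleared constant (assumed value).
const PoolCleared = "connectionPoolCleared"

// NewPoolClearedListener returns a monitor callback that invokes onCleared
// when the pool for a server is cleared, i.e. when a non-transient error
// caused the driver to drop that server's connections.
func NewPoolClearedListener(onCleared func(addr string)) func(*PoolEvent) {
	return func(e *PoolEvent) {
		if e.Type == PoolCleared {
			onCleared(e.Address)
		}
	}
}
```

With the real driver, the returned function would be set as the Event field of an event.PoolMonitor, as in the configuration snippet earlier in this thread.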

Comment by Maxime Jimenez [ 23/Apr/20 ]

Yes, the use case is that we want to be able to call os.Exit() when the server has shut down.

The reason we want to do this is that we will be using MongoDB Atlas, which manages MongoDB upgrades.

To do so, servers are restarted without warning.

Because of this we must be resilient to a server-side disconnection, and we want to rely on the fact that if we kill our process, it will be restarted automatically and connect to the new server that comes up.

Let me know if this is not clear.

Comment by Divjot Arora (Inactive) [ 23/Apr/20 ]

Can you explain your use case? Are you trying to programmatically tell when the server has shut down? Knowing this would be useful for us when making suggestions.

Comment by Maxime Jimenez [ 23/Apr/20 ]

Hi Divjot Arora,

Thanks for the fast answer.

We are using the driver this way:

import (
    "context"

    mongoEvent "go.mongodb.org/mongo-driver/event"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func NewMongoPoolEventListener(onFailure func()) func(event *mongoEvent.PoolEvent) {
    return func(event *mongoEvent.PoolEvent) {
        if event.Type == mongoEvent.ConnectionClosed {
            onFailure()
        }
    }
}

// Configuration
poolMonitor := mongoEvent.PoolMonitor{
    Event: NewMongoPoolEventListener(onFailure),
}
clientOptions := options.Client().ApplyURI("mongodb://localhost:27017/my-database")
clientOptions.SetPoolMonitor(&poolMonitor)
clientOptions.SetMinPoolSize(uint64(1)) // After your suggestion, which seems to work

// Connection
client, err := mongo.Connect(ctx, clientOptions)

While what you suggested seems to work in the end (thank you for that),
I am noticing that I receive the close event more than 45 seconds after MongoDB stopped.
Would ClientOptions.SetHeartbeatInterval() help in this case?

The doc states:
SetHeartbeatInterval specifies the amount of time to wait between periodic background server checks.

Is that the periodic background check mentioned in the Connect() doc as "The Client.Connect method starts background goroutines to monitor the state of the deployment"?

Is this the same periodic background goroutine that is used for the PoolMonitor?

Thanks
Maxime

Comment by Divjot Arora (Inactive) [ 22/Apr/20 ]

Hi maxime.jimenez@kapten.com,

Thank you for the report. Can you provide some more information so we can accurately triage this ticket? What is your monitor configured to do? Is it only looking for ConnectionClosed events?

If you initialize a mongo.Client with the default settings, run no operations, and then call Disconnect, I think this behavior is expected. This is because the first connection is created when the first operation is executed. You can change this by using the ClientOptions.SetMinPoolSize option, which will populate the pool in the background.

– Divjot

Generated at Thu Feb 08 08:36:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.