[GODRIVER-73] Panic from cluster monitor closing channel during Stop Created: 31/Aug/17  Updated: 28/Oct/23  Resolved: 11/Sep/17

Status: Closed
Project: Go Driver
Component/s: Monitoring
Affects Version/s: None
Fix Version/s: 0.0.1

Type: Bug Priority: Critical - P2
Reporter: Adinoyi Omuya Assignee: Craig Wilson
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related

 Description   

When I start mongosqld with ./bin/mongosqld -vvv, I get this panic after sampling completes:

panic: send on closed channel
 
goroutine 36 [running]:
github.com/10gen/sqlproxy/vendor/github.com/10gen/mongo-go-driver/cluster.(*Monitor).startMonitoringServer.func1(0xc420138240, 0xc420138120)
    /Users/wisdom/gopath/src/github.com/10gen/sqlproxy/vendor/github.com/10gen/mongo-go-driver/cluster/monitor.go:188 +0x92
created by github.com/10gen/sqlproxy/vendor/github.com/10gen/mongo-go-driver/cluster.(*Monitor).startMonitoringServer
    /Users/wisdom/gopath/src/github.com/10gen/sqlproxy/vendor/github.com/10gen/mongo-go-driver/cluster/monitor.go:190 +0x12f

The panic is triggered by this line: https://github.com/10gen/sqlproxy/blob/1ed515047a089fed7c64906948622743252eafeb/server/server.go#L103 though it's not clear why that Close() makes the driver panic.

I suspect the problem is in the Go driver, but I haven't been able to pin it down.



 Comments   
Comment by Craig Wilson [ 04/Sep/17 ]

https://mongodbcr.appspot.com/157350001/

Comment by Craig Wilson [ 31/Aug/17 ]

It occurs on this line: https://github.com/10gen/mongo-go-driver/blob/0373c3133c454844322613bea5a230f34465c830/cluster/monitor.go#L188

So the cluster's changes channel was closed before we finished pulling all of the server's changes. I believe this is because when we stop monitoring a server (https://github.com/10gen/mongo-go-driver/blob/0373c3133c454844322613bea5a230f34465c830/cluster/monitor.go#L194), we don't drain its channel; we just unsubscribe. However, if there are still changes on the subscription channel when it gets closed, those changes still come through, because closing a channel doesn't wipe its contents. Hence we keep iterating and eventually send on the already-closed changes channel, so ultimately it's a race condition.
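The two channel behaviors behind this diagnosis can be reproduced in isolation. The following is a minimal sketch, not the driver's code: `sub` and `changes` are hypothetical stand-ins for the subscription channel and the cluster's changes channel, and `trySend` and `pending` are helpers invented for the illustration.

```go
package main

import "fmt"

// trySend attempts to send v on ch and reports whether the send panicked.
// Sending on a closed channel panics with "send on closed channel".
func trySend(ch chan<- string, v string) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch <- v
	return false
}

// pending returns every value still buffered in a closed channel.
// Closing a channel does not discard values that were already sent.
func pending(ch <-chan string) []string {
	var out []string
	for v := range ch {
		out = append(out, v)
	}
	return out
}

func main() {
	// Hypothetical stand-in for a per-server subscription channel.
	sub := make(chan string, 2)
	sub <- "update-1"
	sub <- "update-2"
	close(sub) // models "unsubscribe": buffered updates survive the close

	// Hypothetical stand-in for the cluster's changes channel,
	// already closed by the time the leftovers are forwarded.
	changes := make(chan string, 2)
	close(changes)

	for _, u := range pending(sub) {
		fmt.Println(u, "panicked:", trySend(changes, u))
	}
}
```

Each leftover update is still delivered by `range`, and forwarding it to the closed channel panics, which matches the trace in the description.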

I believe we can solve this by intentionally draining the changes on a subscription after unsubscribing, so that no further changes show up after we have closed the cluster's changes channel.
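One way the proposed drain could look, as a rough sketch rather than the actual patch: `forward` and `stopServer` are hypothetical names, and the sketch additionally assumes the stop path waits for the forwarding goroutine to exit before the changes channel is closed, since draining alone does not cover an update the forwarder has already received. The changes channel is buffered here so the forwarder never blocks; with an unbuffered channel and no reader, waiting on `done` could hang.

```go
package main

import "fmt"

// forward copies updates from a server subscription to the shared changes
// channel until the subscription is closed and empty, then signals done.
func forward(sub <-chan string, changes chan<- string, done chan<- struct{}) {
	defer close(done)
	for u := range sub {
		changes <- u
	}
}

// stopServer is a hypothetical helper modelling the fix: unsubscribe by
// closing sub, discard whatever is still buffered, then wait for the
// forwarder to exit. It returns the number of updates that were drained.
// The caller must guarantee nothing sends on sub after this call.
func stopServer(sub chan string, done <-chan struct{}) int {
	close(sub)
	drained := 0
	for range sub {
		drained++
	}
	<-done
	return drained
}

func main() {
	sub := make(chan string, 8)
	changes := make(chan string, 8) // buffered so forward never blocks here
	done := make(chan struct{})

	for i := 0; i < 5; i++ {
		sub <- fmt.Sprintf("update-%d", i)
	}
	go forward(sub, changes, done)

	drained := stopServer(sub, done)
	close(changes) // safe: the forwarder has exited, so nothing can send

	forwarded := 0
	for range changes {
		forwarded++
	}
	fmt.Println("updates accounted for:", drained+forwarded)
}
```

How the five updates split between drained and forwarded depends on scheduling, but every update ends up in exactly one of the two counts and the close of `changes` can no longer race with a send.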
