[GODRIVER-1815] mongo client has i/o timeout after network exception Created: 29/Dec/20  Updated: 29/Jan/21  Resolved: 29/Jan/21

Status: Closed
Project: Go Driver
Component/s: Networking
Affects Version/s: 1.3.2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Evan Pang
Assignee: Benji Rewis (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

mongodb-server 3.4
mongo-go-driver 1.3.2


Issue Links:
Duplicate
is duplicated by GODRIVER-1814 mongo client has i/o timeout after ne... Closed

 Description   

Hi,

I have a micro-service based on mongo client.

Everything works at first, but after some days a Find(xxx) query fails with the following error:

connection(192.168.20.194:3717[-123425]) incomplete read of message header: read tcp 10.233.26.252:47192->192.168.20.194:3717: use of closed network connection

After that, many subsequent Find(xxx) queries fail with errors like this:

context deadline exceeded

When I run `netstat -apn`, there are no connections between my server and the MongoDB server.

I then tried capturing TCP packets with tcpdump, but no packets appeared either.



 Comments   
Comment by Benji Rewis (Inactive) [ 29/Jan/21 ]

Closing this issue for the time being, 760802635@qq.com. If you have any other questions or concerns after updating, feel free to comment on this ticket again. And thanks again for your report!

-Benji

Comment by Evan Pang [ 29/Jan/21 ]

Hi Benji,

Thanks for your feedback. I will upgrade the driver to version 1.3.4 in our production environment and will let you know if I have any questions. Thanks again.

Comment by Benji Rewis (Inactive) [ 26/Jan/21 ]

Hello again!

Apologies for the delay, 760802635@qq.com. Thanks again for your report. We’re still working on reproducing the error, but it could be caused by a number of issues.

Broadly, network errors can be classified into timeout errors and non-timeout errors. As of driver version 1.3.4, when a non-timeout error occurs, we clear the whole connection pool, and when a timeout error occurs, we close the single, checked-out connection.

We recommend that you upgrade your driver version to at least version 1.3.4, as a bug (GODRIVER-1620) was fixed where timeout errors would incorrectly clear the whole pool. That could be the cause of those “context deadline exceeded” errors.

It’s also possible that the timeout of your client is not long enough, and that after a connection has been closed due to an error, the client does not have enough time to find a new connection before it times out.

Finally, the “context deadline exceeded” errors could be happening because your Mongo servers are unhealthy, and there are no remaining, selectable servers available in the deployment.

We’ll continue trying to reproduce the error, but hopefully that’s helpful in the meantime.

-Benji

Comment by Divjot Arora (Inactive) [ 12/Jan/21 ]

Hi 760802635@qq.com,

Thank you for the bug report. As an update, I'm working on a reproduction script for the issue and will try to test it on different driver versions to see if any of the changes we've made after the 1.3.2 release alleviate it.

– Divjot

Generated at Thu Feb 08 08:37:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.