[JAVA-4851] AsyncGetter thread got stuck and all db transactions were got failed Created: 23/Jan/23  Updated: 27/Oct/23  Resolved: 14/Feb/23

Status: Closed
Project: Java Driver
Component/s: Connection Management
Affects Version/s: 3.12.9
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: senthil kumar c Assignee: Jeffrey Yemin
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File Stack_trace.txt    

 Description   

Summary

All DB transactions suddenly started to fail. The following messages were logged continuously:

- Timeout waiting for a pooled item after 162 MILLISECONDS
- Too many operations are already waiting for a server. Max number of operations (maxWaitQueueSize) of 500 has been exceeded

From the thread dumps, we could see that almost all of the AsyncGetter threads were stuck in the TIMED_WAITING state. We took multiple thread dumps at 5-second intervals, and the AsyncGetter threads looked the same in every one.
We recovered by restarting the application process. After the restart, no issues were seen. We again took multiple thread dumps and they looked fine, i.e. none of the AsyncGetter threads were in TIMED_WAITING.
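
The same observation could also be captured from inside the running process, without taking external thread dumps. Below is a minimal sketch using only the JDK. It assumes the driver's threads can be recognised by the substring "AsyncGetter" in the thread name, as seen in our dumps; the class and method names (`ThreadStateCounter`, `countByState`) are hypothetical and not part of the driver.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateCounter {

    /**
     * Counts live threads whose name contains the given fragment, grouped by
     * thread state. The name fragment is an assumption taken from the thread
     * dumps; adjust it to whatever your dumps actually show.
     */
    public static Map<Thread.State, Integer> countByState(String nameFragment) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // getAllStackTraces() returns a snapshot of all live threads in this JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().contains(nameFragment)) {
                counts.merge(t.getState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Log this periodically; a count that stays in TIMED_WAITING across
        // several samples would match what we saw in the dumps.
        System.out.println(countByState("AsyncGetter"));
    }
}
```

Sampling this every few seconds and logging the result would give the same picture as comparing consecutive thread dumps, though it does not show the stack frames, so a full dump is still needed to see where the threads are waiting.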

Java Mongo client driver version: 3.12.9
Mongo server version: 3.6

How to Reproduce

No definitive steps. The issue occurs randomly and inconsistently.

Additional Background

The issue occurs irrespective of load. Once it is hit, all further DB transactions fail. Attached are the stack traces (AsyncGetter) taken both while the issue was occurring and during normal operation.



 Comments   
Comment by PM Bot [ 14/Feb/23 ]

There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information.

Comment by Jeffrey Yemin [ 30/Jan/23 ]

If none of the existing topics are helpful, I suggest that you ask your question in a new topic on the community forum. There is a wider array of community members there who might be able to help, whereas this Jira project is specifically for reporting bugs and requesting new features, not for general support.

A few things to get you started:

  • JAVA-3927 is not applicable to 3.12.9 and will not fix this issue. That said, we generally recommend upgrading your driver to the latest supported version, which is currently 4.8.
  • "Timeout waiting for a pooled item after 162 MILLISECONDS" is very suspicious, as 162 milliseconds is quite short. So that's unexpected, and you should check your MongoClient settings/connection string for timeouts that are too short. If you don't see anything suspicious there, that could be an error in the exception message. Are you actually seeing requests time out after 160 milliseconds?
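
As a rough illustration of the settings check suggested above, here is a minimal JDK-only sketch that scans a connection string for timeout options with unusually small values. The option names (`waitQueueTimeoutMS`, `connectTimeoutMS`, `socketTimeoutMS`, `serverSelectionTimeoutMS`) are standard connection string options; the class name `PoolSettingsCheck`, the example URI, and the 1000 ms threshold are assumptions made for illustration only. It does not cover timeouts set programmatically through `MongoClientOptions`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PoolSettingsCheck {

    // Timeout-related connection string options, lowercased so that the
    // comparison is case-insensitive.
    private static final List<String> TIMEOUT_KEYS = Arrays.asList(
            "waitqueuetimeoutms", "connecttimeoutms",
            "sockettimeoutms", "serverselectiontimeoutms");

    /**
     * Returns "key=value" entries for timeout options whose value is positive
     * but below thresholdMs. Zero and negative values are skipped because they
     * are not short timeouts; non-numeric values are ignored.
     */
    public static List<String> findShortTimeouts(String uri, long thresholdMs) {
        List<String> findings = new ArrayList<>();
        int q = uri.indexOf('?');
        if (q < 0 || q == uri.length() - 1) {
            return findings; // no options present
        }
        for (String pair : uri.substring(q + 1).split("[&;]")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String key = pair.substring(0, eq).toLowerCase(Locale.ROOT);
            if (!TIMEOUT_KEYS.contains(key)) {
                continue;
            }
            try {
                long value = Long.parseLong(pair.substring(eq + 1).trim());
                if (value > 0 && value < thresholdMs) {
                    findings.add(key + "=" + value);
                }
            } catch (NumberFormatException ignored) {
                // Leave malformed values for the driver to report.
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        // Hypothetical URI; substitute the one your application really uses.
        String uri = "mongodb://localhost:27017/test?waitQueueTimeoutMS=150&maxPoolSize=100";
        System.out.println(findShortTimeouts(uri, 1000));
    }
}
```

If nothing is flagged here and nothing short is set in code either, then the 162 ms figure in the exception message is more likely to be misleading than an actual configured timeout.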
Comment by senthil kumar c [ 30/Jan/23 ]

Hi Jeffrey,

Thanks for the response. Actually, we are using the Community Edition only.

I checked the MongoDB community forum and found a similar issue reported, but it didn't help us.

https://www.mongodb.com/community/forums/t/any-chance-to-get-a-new-java-driver-release-with-latest-bug-fixes/106377

We also came across the ticket below, in which a "Deadlock in `DefaultConnectionPool`" was fixed.

https://jira.mongodb.org/browse/JAVA-3927?anonymousId=2c165c26-3bdd-4ecd-b805-4f2a087ba77b

Is this applicable to 3.12.9, and will it fix the reported issue?

Could you please check and share your input? It would really help us get rid of this issue.

Thanks,

Senthil

Comment by Jeffrey Yemin [ 23/Jan/23 ]

Hi there, thank you for reaching out. As this sounds like a support issue, I wanted to give you some resources to get this question answered more quickly:

  • Our MongoDB support portal, located at support.mongodb.com
  • Our MongoDB community portal, located here
  • If you are an Atlas customer, you can also: Click the in-app chat icon in the lower right corner to chat with a MongoDB Support representative OR Click Support in the left-hand navigation to view Developer Resources.

Just in case you have already opened a support case and are not receiving sufficient help, please let me know and I can facilitate escalating your issue.

Thank you!

Generated at Thu Feb 08 09:03:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.