[JAVA-1883] Using the async driver, no references should remain to a callback after it has been called back Created: 10/Jul/15  Updated: 01/Apr/16  Resolved: 28/Sep/15

Status: Closed
Project: Java Driver
Component/s: Async
Affects Version/s: 3.0.0
Fix Version/s: 3.1.0

Type: Bug Priority: Major - P3
Reporter: Peter Hendriks Assignee: Jeffrey Yemin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: RemainingUpdateRequestsExample.java

 Description   

When using the async API, we have noticed an increase in memory use when executing queries that collect a large number of documents. A heap dump analysis shows that the connection keeps a reference to a callback even after the callback has been invoked. Because the connection itself is pooled, this may lead to memory "leak" issues, especially when the connection pool is configured with a large maximum size.



 Comments   
Comment by Peter Hendriks [ 08/Oct/15 ]

Thanks!

Comment by Jeffrey Yemin [ 07/Oct/15 ]

Released in 3.1.0

Comment by Githook User [ 28/Sep/15 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: JAVA-1883: Since the AsynchronousSocketChannelStream implementation provided by the JDK may store references to completion handlers long after they are no longer used, ensure that the AsynchronousSocketChannelStream completion handlers release their references to upstream handlers as soon as they are no longer needed.
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/2ffdcf5d361d03754161531c5bc0408475d1df85

Comment by Jeffrey Yemin [ 25/Sep/15 ]

Hi Peter,

Thanks for the detective work.

The real offender here is sun.nio.ch.UnixAsynchronousSocketChannelImpl, which does not null out its references to the completion handlers after calling them. But the driver code can work around this by nulling out its own upstream references. So we can fix this.
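
The workaround described above can be sketched roughly as follows. This is only an illustration, not the code from the actual commit: `ReleasingHandler` and `hasReleasedUpstream` are hypothetical names, and only `java.nio.channels.CompletionHandler` is a real JDK type. The wrapper is what would be handed to the channel, so even if the channel keeps pointing at the wrapper, the upstream callback (and whatever it captures) becomes unreachable once it has been called.

```java
import java.nio.channels.CompletionHandler;
import java.util.concurrent.atomic.AtomicReference;

public class ReleasingHandlerExample {

    /** Hypothetical wrapper (not the driver's class) that drops its upstream reference once used. */
    static final class ReleasingHandler<V, A> implements CompletionHandler<V, A> {
        private final AtomicReference<CompletionHandler<V, A>> upstream;

        ReleasingHandler(CompletionHandler<V, A> upstream) {
            this.upstream = new AtomicReference<>(upstream);
        }

        @Override
        public void completed(V result, A attachment) {
            // Clear the reference first, then delegate; a second call becomes a no-op.
            CompletionHandler<V, A> handler = upstream.getAndSet(null);
            if (handler != null) {
                handler.completed(result, attachment);
            }
        }

        @Override
        public void failed(Throwable exc, A attachment) {
            CompletionHandler<V, A> handler = upstream.getAndSet(null);
            if (handler != null) {
                handler.failed(exc, attachment);
            }
        }

        /** Test hook for this sketch: true once the upstream handler is no longer referenced. */
        boolean hasReleasedUpstream() {
            return upstream.get() == null;
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        CompletionHandler<Integer, Void> callback = new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                log.append("completed:").append(bytesRead);
            }

            @Override
            public void failed(Throwable exc, Void attachment) {
                log.append("failed");
            }
        };

        ReleasingHandler<Integer, Void> handler = new ReleasingHandler<>(callback);
        handler.completed(42, null);
        System.out.println(log + " released=" + handler.hasReleasedUpstream());
    }
}
```

Using `getAndSet(null)` rather than a plain field assignment is a design choice of this sketch: it makes the release safe if `completed` and `failed` were ever to race, at the cost of one atomic operation per callback.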

Comment by Peter Hendriks [ 24/Sep/15 ]

@jeff.yemin: it was a bit hard to isolate the code into a small example for the find() scenario, but we ran into a different scenario doing updates/replaces, which I was able to get into a small example case. I think the two scenarios may trigger the same issue so I've added the example to this issue.

Basically, while doing updates, the Mongo Driver captures the parameters in an UpdateRequest object, which it uses to process the request. However, after the request has completed, the UpdateRequest is still referenced by the pooled connection that executed it. Only when that connection processes a new request is the reference overwritten, and only then can the processed request be garbage collected.

For large connection pools combined with large parameter objects (e.g. a large document to replace), this leads to significant additional memory use. In our case, it adds a gigabyte of memory use, which has a noticeable impact on performance.

It seems that the connection could dereference any request objects once execution has completed, as they are no longer needed at that point. That would reduce the number of objects kept live by the connection pool and would improve overall performance.
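
A minimal sketch of that suggestion, under the assumption that the connection tracks its in-flight request in a single field. `PooledConnection`, `inFlight`, and `holdsRequest` are hypothetical names for illustration and do not correspond to the driver's internals; the "send" is a stand-in that completes immediately.

```java
import java.util.function.IntConsumer;

public class PooledConnectionExample {

    /** Hypothetical pooled connection that forgets the request before calling back. */
    static final class PooledConnection {
        private byte[] inFlight;

        void execute(byte[] request, IntConsumer callback) {
            inFlight = request;
            // Stand-in for the real network write: pretend every byte was sent.
            int bytesWritten = inFlight.length;
            // Release the request before notifying the caller, so an idle
            // connection sitting in the pool no longer keeps the payload alive.
            inFlight = null;
            callback.accept(bytesWritten);
        }

        /** Test hook for this sketch: true while a request is still referenced. */
        boolean holdsRequest() {
            return inFlight != null;
        }
    }

    public static void main(String[] args) {
        PooledConnection connection = new PooledConnection();
        byte[] largeDocument = new byte[1024];
        connection.execute(largeDocument, written ->
                System.out.println("written=" + written
                        + " stillReferenced=" + connection.holdsRequest()));
    }
}
```

With the reference cleared before the callback runs, the retained size of an idle connection no longer depends on the last request it served.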

Comment by Peter Hendriks [ 24/Sep/15 ]

Example that demonstrates remaining com.mongodb.bulk.UpdateRequest objects.

Comment by Jeffrey Yemin [ 10/Jul/15 ]

peterhendriks would it be possible to supply a small sample application that demonstrates the problem?
