[JAVA-2074] Mongo client is leaking unclosed connection thread. Created: 05/Jan/16  Updated: 16/Feb/18  Resolved: 25/Jan/16

Status: Closed
Project: Java Driver
Component/s: Cluster Management
Affects Version/s: 2.12.0
Fix Version/s: 3.2.2, 2.14.2

Type: Bug
Priority: Minor - P4
Reporter: Bin Lan
Assignee: Jeffrey Yemin
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related

 Description   

On Cloud Manager, we have a cron job that creates a new MongoClient for each of the backing MongoDB processes to make sure that they are up and have no startup warnings.

We observe that a lot of these ad-hoc clients are not being closed properly. On our side, we do guard the connection instance with a try/finally to make sure the close() method is called.

An example thread dump line looks like this:

"cluster-374495-hostname:port" #817624 daemon prio=5 os_prio=0 tid=0x00007ff9c8276000 nid=0x33b8 waiting on condition [0x00007ff90594e000]

Note that the counter #817624 is relatively large, and we find multiple threads for the same host in the same thread dump. This points to a potential MongoClient leak.

The leaked clients all seem to be connected to arbiter processes, since those are the threads with the higher counter numbers.
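To check a running JVM for this pattern without reading a full thread dump, the live threads can be grouped by name. The following is a minimal sketch, assuming only the "cluster-<id>-<host>:<port>" naming shape visible in the dump line above; the class and method names are hypothetical and not part of the driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MonitorThreadCensus {
    private static final String PREFIX = "cluster-";

    /** Counts names of the form "cluster-<id>-<host>:<port>", grouped by "<host>:<port>". */
    static Map<String, Integer> countByHost(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            if (!name.startsWith(PREFIX)) {
                continue; // not a thread with the assumed naming shape
            }
            int sep = name.indexOf('-', PREFIX.length());
            if (sep < 0) {
                continue; // no host part after the id
            }
            counts.merge(name.substring(sep + 1), 1, Integer::sum);
        }
        return counts;
    }

    /** Applies countByHost to the threads currently alive in this JVM. */
    static Map<String, Integer> countLive() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByHost(names);
    }
}
```

A host whose count keeps growing between two calls to countLive() would be a candidate for the leak described here.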



 Comments   
Comment by Githook User [ 16/Feb/18 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: JAVA-2074: Fix race condition in MultiServerCluster checking whether the instance is closed.
Branch: 2.14.x
https://github.com/mongodb/mongo-java-driver/commit/a3d6b8e542b98e63d6af5a79149fb31c73e98bbc

Comment by Jeffrey Yemin [ 31/Mar/16 ]

2.14.2 has been released with a fix for this bug.

Comment by Jeffrey Yemin [ 16/Feb/16 ]

Closed for 3.2.2 release

Comment by Githook User [ 25/Jan/16 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: JAVA-2074: Fix race condition in MultiServerCluster checking whether the instance is closed.
Branch: 3.2.x
https://github.com/mongodb/mongo-java-driver/commit/15ad618d5871fbf170b9fc4b34bddef99cf6f840

Comment by Githook User [ 25/Jan/16 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: JAVA-2074: Fix race condition in MultiServerCluster checking whether the instance is closed.
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/b96a0db46f9852347467e9629745789202a53887

Comment by Githook User [ 25/Jan/16 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: JAVA-2074: Fix race condition in MultiServerCluster checking whether the instance is closed.
Branch: 2.x
https://github.com/mongodb/mongo-java-driver/commit/5577c5b200e7adaacb21938532b69e27979f49e0

Comment by Githook User [ 25/Jan/16 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@10gen.com>

Message: JAVA-2074: Fix race condition in MultiServerCluster checking whether the instance is closed.
Branch: 2.14.x
https://github.com/mongodb/mongo-java-driver/commit/a3d6b8e542b98e63d6af5a79149fb31c73e98bbc

Comment by Jeffrey Yemin [ 20/Jan/16 ]

Changing priority to minor, as this bug will only be exposed in the rare circumstance where a MongoClient is constructed but never actually used before it's closed.
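For readers unfamiliar with this class of bug, here is a generic sketch of how a "closed" check can race with starting a background thread, and one way to avoid it. It is not the driver's code: the ticket only states that the fix addresses a race in MultiServerCluster when checking whether the instance is closed. The class MonitorRegistry and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MonitorRegistry {
    private final Object lock = new Object();
    private final List<Thread> monitors = new ArrayList<>();
    private boolean closed; // guarded by lock

    /**
     * Starts a monitor thread unless the registry is already closed.
     * The closed check and the registration happen under the same lock as close(),
     * so a thread can never be started after close() has collected the threads to stop.
     */
    boolean addMonitor(Runnable task, String name) {
        synchronized (lock) {
            if (closed) {
                return false;
            }
            Thread t = new Thread(task, name);
            t.setDaemon(true);
            monitors.add(t);
            t.start();
            return true;
        }
    }

    /** Marks the registry closed and interrupts every monitor registered so far. */
    void close() {
        List<Thread> toStop;
        synchronized (lock) {
            closed = true;
            toStop = new ArrayList<>(monitors);
            monitors.clear();
        }
        for (Thread t : toStop) {
            t.interrupt();
        }
    }

    /** Number of monitors currently registered (zero after close()). */
    int registeredCount() {
        synchronized (lock) {
            return monitors.size();
        }
    }
}
```

If the closed check in addMonitor were made outside the lock, a close() call could run between the check and the start, and the new thread would never be interrupted. That is the general shape of leak this sketch is meant to illustrate.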

Comment by Jeffrey Yemin [ 05/Jan/16 ]

Yes, that was perfectly clear.

Comment by Bin Lan [ 05/Jan/16 ]

Hi Jeff,

My wording in the ticket may have been misleading. It is not Mongo itself
that is leaking, but rather the ServerMonitor's monitorThread.

Comment by Jeffrey Yemin [ 05/Jan/16 ]

The Mongo.close() method interrupts the MongoCleaner thread and then calls join on it. Therefore, MongoClient.close() will not even return normally until that thread dies. So if you are also observing many alive MongoCleaner threads, it would suggest that there is a path where MongoClient instances are not being closed properly.
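The interrupt-then-join behaviour described above can be sketched with plain threads. This is an illustration of the pattern only, not the driver's MongoCleaner implementation; BackgroundWorker is a hypothetical stand-in.

```java
class BackgroundWorker implements AutoCloseable {
    private final Thread thread;

    BackgroundWorker(String name) {
        thread = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(1_000); // stand-in for periodic background work
                }
            } catch (InterruptedException e) {
                // interrupted by close(): leave the loop and let the thread end
            }
        }, name);
        thread.setDaemon(true);
        thread.start();
    }

    boolean isAlive() {
        return thread.isAlive();
    }

    /** Interrupts the worker and blocks until it has terminated. */
    @Override
    public void close() {
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }
}
```

Because close() returns normally only after join() completes, a worker that is still alive implies that close() was never called on it or has not yet returned.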

Comment by Jeffrey Yemin [ 05/Jan/16 ]

I also failed to reproduce this with a direct connection to an arbiter.

Comment by Jeffrey Yemin [ 05/Jan/16 ]

I'm not yet able to reproduce this. I created a replica set with three data-bearing nodes and one arbiter, running on ports 27017-27020, and ran the following script:

        MongoClient client = new MongoClient(new MongoClientURI("mongodb://localhost,localhost:27018"));
 
        try {
            client.getDB("admin").command(new BasicDBObject("ping", 1));
        } catch (MongoException e) {
            // do nothing
        } finally {
            Thread.sleep(5000);  // wait long enough for all background threads to be created
            client.close();
        }
 
        Thread.sleep(5000);  // wait long enough for all background threads to die
        Thread.sleep(5000);  // last sleep: inspect the running threads here in the debugger

Stepping through in the debugger, by the time I get to the last sleep, an inspection of the running threads shows none that were created by the driver.

To help debug this further, can you provide a code snippet showing the construction of one of the MongoClient instances that appear to be leaking threads?

Comment by Jeffrey Yemin [ 05/Jan/16 ]

The number after "cluster" represents the cluster id, which is implemented as a static AtomicInteger. So that thread name also implies that the application has created at least 374,495 MongoClient instances in a single class loader context.
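As a rough illustration of why the thread name reveals the instance count, here is a minimal sketch of a static-counter id scheme. The class name, the starting value of the counter, and the exact name format are assumptions based on the dump line in the description, not the driver's actual code.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClusterIdSketch {
    private static final String PREFIX = "cluster-";

    // Shared by all instances loaded by the same class loader.
    // The starting value of 1 is an assumption.
    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);

    private final int id = NEXT_ID.getAndIncrement();

    /** Builds a name in the "cluster-<id>-<host>:<port>" shape seen in the thread dump. */
    String monitorThreadName(String host, int port) {
        return PREFIX + id + "-" + host + ":" + port;
    }

    /** Extracts the id from such a name, or returns -1 if the name has a different shape. */
    static int parseClusterId(String threadName) {
        if (!threadName.startsWith(PREFIX)) {
            return -1;
        }
        int end = threadName.indexOf('-', PREFIX.length());
        if (end < 0) {
            return -1;
        }
        try {
            return Integer.parseInt(threadName.substring(PREFIX.length(), end));
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

Since the counter only ever increases, the largest id observed in a dump is a lower bound on the number of instances constructed so far in that class loader.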
