[JAVA-2778] Implicit sessions are sometimes leaked Created: 06/Feb/18 Updated: 08/Jan/24 Resolved: 12/Feb/18 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Query Operations |
| Affects Version/s: | 3.6.0, 3.6.1, 3.6.2 |
| Fix Version/s: | 3.6.3 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Stephen Machnowski | Assignee: | Jeffrey Yemin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||
| Description |
|
ISSUE DESCRIPTION This issue tracks several bugs with implicit session management that cause sessions to be leaked on some query operations. IMPACT
EXAMPLES A MongoCursor that is not closed will leaks an implicit session
Code which uses a cursor and properly closes it are not affected by the leak:
Any use of the first() method (or findOne() in the DBCollection API):
The only workaround for this manifestation of the leak is to replace with explicit use of a MongoCursor, as above. Original DescriptionWe recently upgraded our production replica set from MongoDB 3.4.11 to 3.6.2. The following error occurs every couple minutes and is output in the Mongo log on the primary. We have also found system.sessions is empty using $listSessions on the config db and an authorized user. On the other hand $listLocalSessions returns many in-memory sessions.
|
| Comments |
| Comment by Jeffrey Yemin [ 13/Feb/18 ] | |||||||||||||
|
smachnowski@vocabulary.com: fixes have been included in 3.6.3, released today. Please let us know if you continue to experience this issue after upgrading. | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Githook User [ 12/Feb/18 ] | |||||||||||||
|
Author: {'email': 'jeff.yemin@10gen.com', 'name': 'Jeff Yemin', 'username': 'jyemin'}Message: | |||||||||||||
| Comment by Jeffrey Yemin [ 07/Feb/18 ] | |||||||||||||
|
Thanks for the feedback. It's true that DBCursor still has a finalize method that will kill an open cursor on the server if the close method hasn't already been called and the cursor hasn't already been exhausted. But finalizers in Java are problematic for a number of reasons, many of which are discussed in "Effective Java, Item 7: Avoid finalizers" (here's one link to the text) and now also in the Java 9 Javadoc for the finalize method. The finalize method has also been deprecated in Java 9. I looked through our current documentation and can't find any remaining recommendation to not explicitly close cursors. If you're aware of any that we've missed, I'd appreciate it if you point it out. Going forward: to immediately reduce the session count on the server, I recommend that you downgrade back to the 3.4 Java driver while we work on a fix for a 3.6.3 patch release to address the issue. | |||||||||||||
| Comment by Marc Tinkler [ 07/Feb/18 ] | |||||||||||||
|
I’d also like to add that not explicitly closing cursors in Java used to be recommended, for performance reasons, since they would be closed in a separate thread by the Java finalized. | |||||||||||||
| Comment by Mira Carey [ 07/Feb/18 ] | |||||||||||||
|
smachnowski@vocabulary.com, you're absolutely correct that there is also a Server issue at play here. I've linked | |||||||||||||
| Comment by Stephen Machnowski [ 07/Feb/18 ] | |||||||||||||
|
Yes, we do have hundreds of uses of com.mongodb.DBCollection#findOne() as well as other DBCollection methods that delegate to the same underlying code. I think the issue may not be just the driver though. No matter how many active server sessions there are, the mongodb server should be able to issue a refresh (persist) them to the system.sessions collection. The original error message appears to be a fully server-side error in it's session management.
To provide more information, I'd like to note we have reduced the following server parameter to 1 minute (logicalSessionRefreshMinutes). With the current rate local in-memory sessions ($listLocalSessions) are accumulating on the server side, the server is able to "keep up" and persist them, meaning no BSON size assertion errors are thrown because the BSON update object fits into 16MB.
I am adding screen shots of the graph from Mongo Cloud Manager displaying how every 5 minutes, processing this massive 30+ million local session cache containing orphaned/expired sessions affected performance. | |||||||||||||
| Comment by Jeffrey Yemin [ 06/Feb/18 ] | |||||||||||||
|
As this now appears to be an issue with the Java driver I've moved it to the JAVA project. | |||||||||||||
| Comment by Jeffrey Yemin [ 06/Feb/18 ] | |||||||||||||
|
Hi smachnowski@vocabulary.com, We found a session leak in the Java driver's implementation of MongoIterable#first in com.mongodb.FindIterableImpl, which would typically be encountered via a call like:
A similar leak exists in com.mongodb.DBCollection#findOne(). Can you check if your application calls either of those methods? If not, the other likely cause is a situation where either a com.mongodb.DBCursor or a com.mongodb.client.MongoCursor is not closed by the application, e.g.
Please also audit your application for this pattern. | |||||||||||||
| Comment by Stephen Machnowski [ 06/Feb/18 ] | |||||||||||||
|
MongoDB servers are running on Amazon Linux (i3.4xlarge): Driver is Mongo Java Driver 3.6 from: We noticed the error because we are troubleshooting intermittent, sudden spikes of bad performance. This error message preceded the bad performance in the cases we looked at. | |||||||||||||
| Comment by Daniel Pasette (Inactive) [ 06/Feb/18 ] | |||||||||||||
|
Hi Stephen. Can you tell use which driver, version and platform you're running on? | |||||||||||||
| Comment by Stephen Machnowski [ 06/Feb/18 ] | |||||||||||||
|
This is a non-sharded replica set, so there is no config server. | |||||||||||||
| Comment by Ramon Fernandez Marina [ 06/Feb/18 ] | |||||||||||||
|
smachnowski@vocabulary.com, can you please upload the logs for the config server primary? I think that's the first step in tracking this down. | |||||||||||||
| Comment by Stephen Machnowski [ 06/Feb/18 ] | |||||||||||||
|
And this is a later result from serverStatus() to illustrate the pattern. activeSessionsCount is 4561421, but lastSessionsCollectionJobEntriesRefreshed = 0.
| |||||||||||||
| Comment by Stephen Machnowski [ 06/Feb/18 ] | |||||||||||||
|
From db.serverStatus(), displays we have 2.46M active sessions. We have not called the new Mongo.startSession() feature yet from any client code. The status document below shows the lastSessionsCollectionJobTimestamp was just shortly after we started the server, but the session refresh hasn't completed since then.
|