[JAVA-3087] Buffer leak when network compression enabled Created: 09/Nov/18  Updated: 28/Oct/23  Resolved: 13/Nov/18

Status: Closed
Project: Java Driver
Component/s: Wire Protocol
Affects Version/s: 3.7.0
Fix Version/s: 3.9.1

Type: Bug Priority: Critical - P2
Reporter: Colin Bester Assignee: Jeffrey Yemin
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu linux LTS 18.x
MongoDB Atlas 3.6.8
Mongo-scala-driver 2.3
Scala 2.11.11


Attachments: Text File rest-error.log    

 Description   

We have been running same s/w and configuration for couple of months with no issue and on 26 Oct we made change to RestFul server configuration to enable network compression using snappy by setting compressor=snappy when initialising mongo-scala-driver, but yesterday (8th Nov) we had system failure and clients were getting “Exception Boxed Error” on our web application (Running liftweb 3.1 on Jetty 9.2.13).

Our system as seen from client end is angularjs -> liftweb application (running on jetty) -> AKKA-Http RestFul server with mongo-scala-driver -> Atlas Mongo.

Only error I see in restFul log files is "ERROR io.netty.util.ResourceLeakDetector - LEAK: ByteBuf.release() was not called before it's garbage-collected”. Looking through log files the earlies I see the error is on the 26 - tied to same day we made change to enable network compression.

We have production and pre-stage systems, both employing same software versions, one using M40 Mongo Atlas and the other M10 Mongo Atlas - both show this error.
On my development server which is running same s/w on same Linux version but communicating to single instance MongoDB (same version as production and pre-stage) does not show this error with same snappy compression added.

If I disable snappy compression on pre-staging and production server then it appears that I don't see the error. Time is still to early to tell if this is fact or not, but if I enable the compression I see error within 30 minutes. I have disabled compression for 3 hours now and have not yet seen the error.

I enabled advanced logging to try and pinpoint memory leak which lead me to mongodb - I have attached log file after restarting server.

Is there by chance a know issue with enabling compression?
I assume from my end there is no different in coding solution if compression is enabled or not?

I am hoping someone can help guide me in right direction.



 Comments   
Comment by Colin Bester [ 26/Dec/18 ]

Thanks. I think that closes this thread - thanks for all the help.

Comment by Jeffrey Yemin [ 26/Dec/18 ]

Yes, it's fine. Generally it's safe to upgrade to a patch release so long as a library follows semantic versioning, which the Java driver does.

Comment by Colin Bester [ 24/Dec/18 ]

So overriding async driver to 3.9.1 with mongo-scala-driver 2.5.0 will be okay?

 

Comment by Jeffrey Yemin [ 24/Dec/18 ]

We haven't done a 2.5.1 release yet of the Scala driver that directly depends on 3.9.1 of the async Java driver, but you can always take a direct dependency on 3.9.1 in your own build file, and that will override Scala driver's dependency.

Comment by Colin Bester [ 24/Dec/18 ]

Thanks, then I assume it's not rolled out to Scala driver yet?

Comment by Jeffrey Yemin [ 24/Dec/18 ]

Hi colin.bester@advancedbiotechnologiesllc.com,

The bug fix is available in the 3.9.1 release of mongodb-driver-async, available on Maven Central: http://central.maven.org/maven2/org/mongodb/mongodb-driver-async/3.9.1/

Comment by Colin Bester [ 24/Dec/18 ]

Any idea when bug fix (Netty/Compression memory leak) https://oss.sonatype.org/content/repositories/snapshots/org/mongodb/mongodb-driver-async/3.9.1-SNAPSHOT/ will be added to java and scala projects so that I can update my production project?

Comment by Colin Bester [ 20/Nov/18 ]

Thanks @Ross

Comment by Ross Lawley [ 20/Nov/18 ]

I didn’t see upgrade notes for 2.5.0 so am only upgrading to 2.4.2 which is a little sad.

Thanks colin.bester@advancedbiotechnologiesllc.com I've only added upgrade notes when there has been a change that requires extra consideration for upgrading.

I can see that with recent releases that is non obvious, so I'll explicitly add notes for all minor releases from now on.

Comment by Colin Bester [ 19/Nov/18 ]

Now I am happy

I am rebuilding for scala 2.4.2 to try the patched java version.

I didn’t see upgrade notes for 2.5.0 so am only upgrading to 2.4.2 which is a little sad.

~C

Comment by Jeffrey Yemin [ 19/Nov/18 ]

Yes, it's Netty + compression

Comment by Colin Bester [ 19/Nov/18 ]

So Netty plus compression and not just netty enabled without compression? I definitely don’t see error in log file if netty is enabled and network compression is disabled.

Sent from my iPhone

Comment by Jeffrey Yemin [ 19/Nov/18 ]

It has nothing to do with SSL, and everything to do with how the driver uses Netty's io.netty.buffer.ByteBuf. I can reproduce this issue whether or not SSL is enabled.  The only requirement is that Netty is enabled.

Comment by Colin Bester [ 19/Nov/18 ]

Yup, adding streamType causes leak. Seeing as issue only shows with netty enabled and compression I assume issue is to do with compression with SSL?

Thanks so much for all your assistance. I will try patch fix in today or tomorrow and fix. I am hoping I don't have to change my Mongo-scala driver version which is mongo-scala-driver_2.11-2.3.0 as I believe it uses java 3.7.0

Comment by Jeffrey Yemin [ 18/Nov/18 ]

Just insert streamType=netty. SSL is not required to use Netty; rather, Netty is required to use SSL (with the async driver).

Comment by Colin Bester [ 17/Nov/18 ]

I haven't loaded the snapshot jar file yet as I wanted to first be able to replicate error and then I will try.

In order to enable Netty on my local setup do I just need to insert 'streamType=netty' as option or do I have to setup SSL and certificates on my local MongoDB instance?

Comment by Jeffrey Yemin [ 17/Nov/18 ]

That explains why Netty is not reporting leaks in your local setup: you're not using Netty there. Please enable Netty for your local setup and confirm that you see the leaks. Also, please double check that you're loading InternalStreamConnection from the expected jar file. Looking at the stack trace from your last comment:

com.mongodb.connection.ByteBufferBsonOutput.close(ByteBufferBsonOutput.java:181)
com.mongodb.connection.InternalStreamConnection.sendAndReceiveAsync(InternalStreamConnection.java:324)

which still corresponds to the line numbers from the 3.9.0 release: https://github.com/mongodb/mongo-java-driver/blob/r3.9.0/driver-core/src/main/com/mongodb/internal/connection/InternalStreamConnection.java#L324

Compare with what's on line 324 in the 3.9.1-SNAPSHOT release: https://github.com/mongodb/mongo-java-driver/blob/3.9.x/driver-core/src/main/com/mongodb/internal/connection/InternalStreamConnection.java#L324: there's no executable code on that line. So I suspect that either you're not using the 3.9.1-SNAPSHOT or we didn't publish it correctly after the change was committed.

Comment by Colin Bester [ 17/Nov/18 ]

Several days on test setup with no Leak error messages.

This morning I modified configuration to point to my staging (cluster0) Atlas cluster and issue was there out the bat.
 

LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
 WARNING: 196 leak records were discarded because the leak record count is limited to 4. Use system property io.netty.leakDetection.maxRecords to increase the limit.
 Recent access records: 4
 #4:
 io.netty.buffer.AdvancedLeakAwareByteBuf.release(AdvancedLeakAwareByteBuf.java:724)
 com.mongodb.connection.netty.NettyByteBuf.release(NettyByteBuf.java:239)
 com.mongodb.connection.ByteBufferBsonOutput.close(ByteBufferBsonOutput.java:181)
 com.mongodb.connection.InternalStreamConnection.sendAndReceiveAsync(InternalStreamConnection.java:324)
 com.mongodb.connection.UsageTrackingInternalConnection.sendAndReceiveAsync(UsageTrackingInternalConnection.java:113)
 com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendAndReceiveAsync(DefaultConnectionPool.java:452)
 com.mongodb.connection.CommandProtocolImpl.executeAsync(CommandProtocolImpl.java:76)
 com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.executeAsync(DefaultServer.java:203
....

 So still no idea why no LEAK errors when connecting to standalone database vs Atlas. 

On Scala side my options to URL for local setup is:

options = "retryWrites=true&authSource="${mongo.database}"&authMechanism="${mongo.authMechanism}"&compressors=snappy"

and for Atlas setup is

options = "streamType=netty&retryWrites=true&authSource=admin&authMechanism="${mongo.authMechanism}"&compressors=snappy"

Only difference I see is inclusion of streamType=netty

Comment by Colin Bester [ 15/Nov/18 ]

I haven't yet but will do

Comment by Jeffrey Yemin [ 15/Nov/18 ]

I don't have an explanation for this. Have you tried io.netty.leakDetection.level=paranoid?

Comment by Colin Bester [ 15/Nov/18 ]

Only difference between the two setups is that internal setup is not using SSL while the staging system is employing SSL (as can be seen by setting streamType in MongoDB URL.

I confirm that I do not see the leak message on my test server running dedicated and standalone MongoDB instance.

I don't have much control on Atlas configuration but here are snippets showing command line invocation of servers and connection URL together with other information.

Both dedicated server and staging server are running same Restful application s/w version which handles MongoDB connection and read/writes.

Dedicated server is running MongoDB 3.6.8 and Atlas is running MongoDB 3.6.8 on M10 instance

Dedicated Rest Server with standalone mongodb instance running on same host as restful server
Output of ps:

root     31694     1  0 14:25 ?        00:00:34 /usr/bin/java -Dio.netty.leakDetection.level=advanced -cp resources:spray-rest.jar:ext/*:dependencies/* com.besterdesigns.bc.rest.RestMicroService

Database configuration in Rest server is via URL - my application config is:

{ .... 
  prefix="mongodb"
  options = "retryWrites=true&authSource="${mongo.database}"&authMechanism="${mongo.authMechanism}"&compressors=snappy"
  url = ${mongo.prefix}"://"${mongo.username}":"${mongo.password}"@"${mongo.host}"/"${mongo.database}"?"${mongo.options}
}

Java version:

colinbes@besterdesigns-ubuntu:~> java -version
openjdk version "1.8.0_181
"OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.18.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

Database logs show compression in action:

2018-11-15T20:26:51.780Z D NETWORK  [conn9] Compressing message with snappy
2018-11-15T20:26:51.783Z D NETWORK  [conn9] Decompressing message with snappy
2018-11-15T20:26:51.783Z D COMMAND  [conn9] run command bcmongo.$cmd { find: "device", filter: { _id: { $in .....

Staging server using Atlas

Output of ps:

root       968     1  0 Nov13 ?        00:10:08 /usr/bin/java -Dio.netty.leakDetection.level=advanced -cp resources:spray-rest.jar:ext/*:dependencies/* com.besterdesigns.bc.rest.RestMicroService

Database configuration in Rest server is via URL:

{   
....
  prefix="mongodb+srv"
  options = "streamType=netty&retryWrites=true&authSource=admin&authMechanism="${mongo.authMechanism}"&compressors=snappy"
  url = ${mongo.prefix}"://"${mongo.username}":"${mongo.password}"@"${mongo.host}"/"${mongo.database}"?"${mongo.options}
}

Java version:

bcdemo@demo:~$ java -versionopenjdk version "1.8.0_162"
OpenJDK Runtime Environment (build 1.8.0_162-8u162-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)

Just wanting to make sure all bases are covered in case there is more than one issue.

Comment by Jeffrey Yemin [ 15/Nov/18 ]

There must be something different either or in your configuration of the client or of the server. The patch is agnostic to server topology: if compression is enabled, the leak should manifest.

Comment by Colin Bester [ 15/Nov/18 ]

My standalone does have snappy compression enabled and I verified it was being used by examining logs for "decompressing" and "compressing" messages.

I am hoping to get new build compiled today (excluding fix to java mongo) and will confirm.

Comment by Jeffrey Yemin [ 15/Nov/18 ]

No, but I suspect that your standalone didn't have Snappy compression enabled, so compression wasn't actually happening even though you configured it in the driver.

Comment by Colin Bester [ 15/Nov/18 ]

Does the issue you found explain why leak only occurred when connection to Atlas and not a standalone single mongodb?

Comment by Colin Bester [ 13/Nov/18 ]

Thanks. I am in throws of finalizing testing of new features and will include testing 3.9.1 for stage 2 testing.

 

Comment by Jeffrey Yemin [ 13/Nov/18 ]

The 3.9.1-SNAPSHOT has been published to Sonatype. See https://oss.sonatype.org/content/repositories/snapshots/org/mongodb/mongodb-driver-async/3.9.1-SNAPSHOT/

Comment by Jeffrey Yemin [ 13/Nov/18 ]

will this change roll over to scala project as well (update dependencies)

Eventually, but for now you'll need to override the transitive dependency from the Scala driver by explicitly depending on 3.9.1-SNAPSHOT

is there a reason I can't use mention feature?

Not sure

I never tested using zlib compression but assume issue would have presented itself as well if I had used zLib - would you agree and would fix address both snappy and zlib?

Yes, the fix will apply to any type of compression.

Comment by Colin Bester [ 13/Nov/18 ]

I never tested using zlib compression but assume issue would have presented itself as well if I had used zLib - would you agree and would fix address both snappy and zlib?

Comment by Colin Bester [ 13/Nov/18 ]

Off topic I know, but is there a reason I can't use mention feature?

Comment by Colin Bester [ 13/Nov/18 ]

Thanks Jeff, will this change roll over to scala project as well (update dependencies)?

 

Comment by Jeffrey Yemin [ 13/Nov/18 ]

Hi colin.bester@advancedbiotechnologiesllc.com there should be a 3.9.1-SNAPSHOT available in a few hours that contains a fix for this issue.  Thanks again for the report.

Comment by Githook User [ 13/Nov/18 ]

Author:

{'name': 'Jeff Yemin', 'email': 'jeff.yemin@10gen.com', 'username': 'jyemin'}

Message: Fix buffer leak during compression

Ensure that buffers used for uncompressed message prior to compression
are released.

JAVA-3087
Branch: 3.9.x
https://github.com/mongodb/mongo-java-driver/commit/0c01d71dae62863e59575013f6441d005aaa804c

Comment by Githook User [ 13/Nov/18 ]

Author:

{'name': 'Jeff Yemin', 'email': 'jeff.yemin@10gen.com', 'username': 'jyemin'}

Message: Fix buffer leak during compression

Ensure that buffers used for uncompressed message prior to compression
are released.

JAVA-3087
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/525070b349e5a3d601a5bbab17de0c4bc82f087f

Comment by Colin Bester [ 12/Nov/18 ]

I've been running production and staging server without snappy network compression enabled for close to three days with no error. This morning I re-enabled snappy network compression on staging server and restarted application and within the hour I saw same error again

 

 

2018-11-12 08:11:01 [cluster-ClusterId{value='5be975b4b0aba366f1109469', description='null'}-cluster0-shard-00-01-docco.mongodb.net:27017] ERROR io.netty.util.ResourceLeakDetector - LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
WARNING: 293 leak records were discarded because the leak record count is limited to 4. Use system property io.netty.leakDetection.maxRecords to increase the limit.
Recent access records: 4
#4:
io.netty.buffer.AdvancedLeakAwareByteBuf.release(AdvancedLeakAwareByteBuf.java:724)
com.mongodb.connection.netty.NettyByteBuf.release(NettyByteBuf.java:239)
com.mongodb.connection.ByteBufferBsonOutput.close(ByteBufferBsonOutput.java:181)
com.mongodb.connection.InternalStreamConnection.sendAndReceiveAsync(InternalStreamConnection.java:324)
com.mongodb.connection.UsageTrackingInternalConnection.sendAndReceiveAsync(UsageTrackingInternalConnection.java:113)
com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendAndReceiveAsync(DefaultConnectionPool.java:452)
com.mongodb.connection.CommandProtocolImpl.executeAsync(CommandProtocolImpl.java:76)
com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.executeAsync(DefaultServer.java:203)
com.mongodb.connection.DefaultServerConnection.executeProtocolAsync(DefaultServerConnection.java:280)
com.mongodb.connection.DefaultServerConnection.commandAsync(DefaultServerConnection.java:151)
com.mongodb.connection.DefaultServerConnection.commandAsync(DefaultServerConnection.java:142)

 

Comment by Colin Bester [ 09/Nov/18 ]

Thanks @Jeff Yemin , appreciated. 

 

Comment by Jeffrey Yemin [ 09/Nov/18 ]

Hi colin.bester@advancedbiotechnologiesllc.com  Thank you for this report.  Since the compression support is in the underlying Java driver, I've move this issue into the JAVA project.  This is not a known issue so we will investigate as soon as possible.  For now, please keep compression disabled until we are able to provide a fix.

Generated at Thu Feb 08 08:58:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.