[JAVA-1931] com.mongodb.MongoSocketWriteException when using Async driver + Netty Created: 24/Aug/15  Updated: 27/Jun/17  Resolved: 13/Apr/16

Status: Closed
Project: Java Driver
Component/s: Async
Affects Version/s: 3.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Frederico Ferro Schuh Assignee: Ross Lawley
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 7, application built on Vertx 3 + Netty 4.0.28


Attachments: Java Source File TestMongoSocketWriteException.java    
Issue Links:
Related
is related to JAVA-1934 Netty readTimeout doesn't work as exp... Closed
is related to JAVA-1973 AbstractSubscription implementations ... Closed

 Description   

I'm using the new Mongo RxJava async driver, and I'm running into issues when I run it through Netty.

When my application idles for about 1 minute without doing any database operation, in the next operation I'll get the following Exception:

com.mongodb.MongoSocketWriteException: Exception sending message
	at com.mongodb.connection.InternalStreamConnection.translateWriteException(InternalStreamConnection.java:462) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.InternalStreamConnection.access$1000(InternalStreamConnection.java:65) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.InternalStreamConnection$2.failed(InternalStreamConnection.java:320) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.netty.NettyStream$3.operationComplete(NettyStream.java:170) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.netty.NettyStream$3.operationComplete(NettyStream.java:166) ~[mongodb-driver-core-3.0.2.jar:?]
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) [netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
Caused by: java.nio.channels.ClosedChannelException
19:49:01,201 ERROR vert.x-eventloop-thread-4 com.ubi.petz.verticle.RestVerticle.lambda$processRequest$101(RestVerticle.java:367) - Error on Action: {"@class":"com.ubi.petz.action.SavePlayerAction","action_time":1,"inter_serial_num":2,"player_id":"test_id","name":"new name"}
com.ubi.petz.exception.ActionException: com.mongodb.MongoSocketWriteException: Exception sending message
	at com.ubi.petz.verticle.RestVerticle.lambda$null$95(RestVerticle.java:272) ~[main/:?]
	at com.ubi.petz.verticle.RestVerticle$$Lambda$100/370109114.call(Unknown Source) ~[?:?]
	at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$1.onError(OperatorOnErrorResumeNextViaFunction.java:99) [rxjava-1.0.13.jar:1.0.13]
	at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.pollQueue(OperatorObserveOn.java:197) [rxjava-1.0.13.jar:1.0.13]
	at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber$2.call(OperatorObserveOn.java:170) [rxjava-1.0.13.jar:1.0.13]
	at com.ubi.petz.core.rx.VertxContextScheduler$WorkerImpl$TimedAction.run(VertxContextScheduler.java:108) [main/:?]
	at com.ubi.petz.core.rx.VertxContextScheduler$WorkerImpl$TimedAction.lambda$new$77(VertxContextScheduler.java:96) [main/:?]
	at com.ubi.petz.core.rx.VertxContextScheduler$WorkerImpl$TimedAction$$Lambda$42/715081036.handle(Unknown Source) [main/:?]
	at io.vertx.core.impl.ContextImpl.lambda$wrapTask$15(ContextImpl.java:314) [vertx-core-3.0.0.jar:?]
	at io.vertx.core.impl.ContextImpl$$Lambda$8/1127177174.run(Unknown Source) [vertx-core-3.0.0.jar:?]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) [netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.28.Final.jar:4.0.28.Final]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
Caused by: com.mongodb.MongoSocketWriteException: Exception sending message
	at com.mongodb.connection.InternalStreamConnection.translateWriteException(InternalStreamConnection.java:462) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.InternalStreamConnection.access$1000(InternalStreamConnection.java:65) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.InternalStreamConnection$2.failed(InternalStreamConnection.java:320) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.netty.NettyStream$3.operationComplete(NettyStream.java:170) ~[mongodb-driver-core-3.0.2.jar:?]
	at com.mongodb.connection.netty.NettyStream$3.operationComplete(NettyStream.java:166) ~[mongodb-driver-core-3.0.2.jar:?]
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) ~[netty-transport-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-common-4.0.28.Final.jar:4.0.28.Final]
	... 1 more
Caused by: java.nio.channels.ClosedChannelException

By looking at the stack trace you can't tell exactly which operation caused the error because of the asynchronous Observables, but the above Exception happened on MongoCollection.replaceOne() operation.
This happens 100% of the times, but only when I tell Mongo to use Netty (by calling System.setProperty("org.mongodb.async.type", "netty") as mentioned in the docs). It happens on read operations as well.
If Mongo doesn't use Netty, everything runs perfectly well.
My application is already using Netty for network I/O, so I thought I'd leverage it with MongoDB for increased performance.
Could this be an issue on the driver?
I'm using the 3.0.2 Java driver, on top of the RxJava 1.0.1 driver. Running MongoDB 2.6.7. Netty 4.0.28.

As this is a very young driver, I understand there may still be bugs, and I'd be glad if this stack trace could be of any use to identify them.



 Comments   
Comment by Frederico Ferro Schuh [ 13/Oct/15 ]

Thanks Ross! Updated and working great!

Comment by Ross Lawley [ 08/Oct/15 ]

Hi, just to let you know the rx driver 1.1 was released today!

Comment by Ross Lawley [ 22/Sep/15 ]

Great stuff, I'm closing this ticket now!

Thanks again for your help and contribution - happy MongoDB'ing!

Ross

Comment by Frederico Ferro Schuh [ 22/Sep/15 ]

Hi Ross, I'm also very happy to say I can no longer reproduce both issues on the latest snapshot of the async 3.1 and Rx 1.1 drivers.
Glad I could be of any help, and really appreciate your persistence and efficiency on tracking down these bugs.
Hope I'll be able to help again in the future!

Wish you and the rest of the Mongo team the best.

Comment by Ross Lawley [ 21/Sep/15 ]

I'm happy to say the snapshot of mongodb-driver-async has been published and I can no longer reproduce the issue.

fschuh Many thanks for your patience and help diagnosing this.

Comment by Frederico Ferro Schuh [ 21/Sep/15 ]

Great, glad to know you've found the issue!
I'll be following JAVA-1973 as well, will test it again as soon as you guys have a fix ready.

Comment by Ross Lawley [ 21/Sep/15 ]

OK I think I have found the cause - the new Observable abstraction in 3.1 had a bug in SingleResultCallbackSubscription}}s. It wrongly defaulted the {{completed state. Due to how the data is queued and processed multiple calls to request mean there was a race condition and the completed flag is checked and the Observable is marked as completed before data is returned.

In RxJava 1.0.12 the data was only requested once - hence no bug and it working fine. There was a change in 1.0.13 with regards to how Merge works and that surfaced the bug. I have opened JAVA-1973 with a fix for the helpers and that should propagate and fix the issue in the reactive drivers.

Comment by Ross Lawley [ 21/Sep/15 ]

Great work! I'll update to the latest RxJava and investigate now.

Comment by Frederico Ferro Schuh [ 21/Sep/15 ]

Hi Ross,

I found out why you can't reproduce the issue but it still happens with me.
The Rx driver 1.1.0 SNAPSHOT depends on RxJava version 1.0.12, while I was using a newer version, 1.0.13.
If I force my project to use 1.0.12, then main2 works correctly.
With the newer RxJava releases however, it doesn't work.
RxJava's latest release is 1.0.14, which also causes the issue on main2.

Please have a check with the newer RxJava versions, something must have changed on it that is causing the Rx driver to break.

If you run your QuickStart class with the following dependencies, you can reproduce the issue:

dependencies {
	compile 'io.reactivex:rxjava:1.0.14'
	compile 'org.mongodb:mongodb-driver-rx:1.1.0-SNAPSHOT'
	compile 'io.netty:netty-all:4.0.28.Final'
	compile 'io.vertx:vertx-core:3.0.0'
	compile 'ch.qos.logback:logback-classic:1.1.1'
}

Comment by Ross Lawley [ 18/Sep/15 ]

Hi,

Great that the first issue has been fixed correctly.

Unfortunately, I'm still am unable to duplicate the second issue with the following dependencies:

dependencies {
    compile 'org.mongodb:mongodb-driver-rx:1.1.0-SNAPSHOT'
    compile 'io.netty:netty-all:4.0.28.Final'
    compile 'io.vertx:vertx-core:3.0.0'
    compile 'ch.qos.logback:logback-classic:1.1.1'
}

The code I'm running is here: https://github.com/rozza/mongo-java-quickstart/tree/3.0-rx-vert-test - can you confirm that you can cause it to fail when calling:

./gradlew quickStart --refresh-dependencies

Ross

Comment by Frederico Ferro Schuh [ 18/Sep/15 ]

Hello Ross,

I have tested again, and can confirm the main1 issue no longer occurs with the latest 3.1 SNAPSHOT, as well as the latest release 3.0.4, it appears to have been fixed now.

The main2 issue is still present on the latest RX 1.1 SNAPSHOT driver.
I think you're not being able to reproduce the main2 issue because it's specific to the Rx driver, I can confirm it still works works with the Rx 1.0.1 and either the async driver 3.0.4 or 3.1 SNAPSHOT. It has no relationship to the async driver, but the RX driver, so make sure you're using the Rx driver 1.1 SNAPSHOT.

Comment by Ross Lawley [ 09/Sep/15 ]

Hi fschuh,

Just to let you know a new snapshot has been released with the fix. When testing and refreshing dependencies I'm seeing both cases work as expected.

My test code is here: https://github.com/rozza/mongo-java-quickstart/tree/3.0-rx-vert-test (just so gradle can manage the dependencies). I still can't replicate the failure for the main2 (I test this just by renaming the methods in the test code posted above).

Could you retest and confirm main is fixed for you? Also, if you can provide any further information on the main2 case then that would be really helpful.

Ross

Comment by Ross Lawley [ 08/Sep/15 ]

Thanks fschuh, looks like I missed an important latency based failure case in JAVA-1934.

I've reopened that ticket and have a fix in review and I'll post an update this ticket once that is done. I'm still struggling to reproduce main2 failure but I'll investigate once a new mongodb-async-driver snapshot is released.

Ross

Comment by Frederico Ferro Schuh [ 08/Sep/15 ]

Hi Ross,

I just tried with 3.1 SNAPSHOT dated from 2015/09/07, and I still get the main2 issue.
Note that I'm also using the RX driver 1.1 snapshot.

The main1 issue is still present as well.

Comment by Ross Lawley [ 01/Sep/15 ]

Hi @fschuh,

Can you still reproduce the main2 issue?

Ross

Comment by Ross Lawley [ 26/Aug/15 ]

Hi Frederico,

I renamed main2 to main and always got result: true. I'll leave this ticket open for now and review post JAVA-1934.

Thanks for taking the time to report and help reproduce these issues, its appreciated! As is any further feedback on the RxJava driver.

Ross

Comment by Frederico Ferro Schuh [ 26/Aug/15 ]

Make sure you just run main2, don't run the original main as well.
Did main2 print result: true to you?

Comment by Ross Lawley [ 25/Aug/15 ]

I was unable to reproduce the issue with the latest snapshot and main2, I ran it over 50 times. So leaving that for now.

Comment by Frederico Ferro Schuh [ 25/Aug/15 ]

Thanks Ross, I'll disable read timeout for now. Setting the timeouts on my Observables will work good as a temporary workaround.
I'll be watching JAVA-1934, hoping for a fix when 3.1 is out

Comment by Ross Lawley [ 25/Aug/15 ]

Hi fschuh,

I have found the issue with the NettyStream - currently the read timeout doesn't work the same as it does with the nio2 and is more akin to the MaxIdleTime setting. I've added JAVA-1934 to track that issue.

Currently, for your code I would recommend removing the read timeout and settings the MaxIdleTime setting in ConnectionPoolSettings, that will handle the connection pool resource usage. Then with any reads / operations with MongoDB add timeout to the observable.

I'm now going to look into the 3.1.0-SNAPSHOT issues.

Ross

Comment by Ross Lawley [ 25/Aug/15 ]

Thanks, I'm trying to repo and debug this now.

Comment by Frederico Ferro Schuh [ 25/Aug/15 ]

Hi Ross, thanks for the quick follow up.
I tested 3.1.0-SNAPSHOT and 1.1.0-SNAPSHOT, and they don't fix the issue, in fact they cause another issue as well (I describe it a the end of this comment).

And after digging a bit, I finally found the reason of the Exception.

I've attached a file with a 100% reproduction test.
The issue happens when setting a custom SocketSettings to the MongoClient, and only when running Vertx 3 at the same time.
The SocketSettings's readTimeout attribute is causing the issue. Whatever value I set it to, for example 5 seconds, if I idle for 5 seconds before the next DB operation then the issue will happen.

I failed to reproduce the issue if Vertx 3 is not running.

Also, the attached code contains a main2() method that performs a test that fails on 3.1.0-SNAPSHOT and Rx driver 1.1.0-SNAPSHOT, while it works correctly under 3.0.2 and Rx driver 1.0.1.
The main2() method's failure is unrelated to the original issue, it's an entirely different problem. With the updated driver, MongoCollection.replaceOne() and find() in some cases will not emit any value, causing my Observable to return an empty result and breaking my chain.
Please check the comments in the code for more information.

The Vertx dependency you need in order to run the code (Gradle)
dependencies

{ compile 'io.vertx:vertx-core:3.0.0' }
Comment by Ross Lawley [ 24/Aug/15 ]

Hi fschuh,

Thanks for the ticket, I see you are using Vert.x as well, can you produce any sample code that reproduces this issue? Also, could you test with the 1.1.0-SNAPSHOT? It relies on the 3.1.0-SNAPSHOT MongoDB Java Driver which contains some improvements and fixes to the underlying async driver.

Ross

Generated at Thu Feb 08 08:55:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.