[JAVA-4546] Reactive change stream stuck after connection lost for longer than server selection timeout Created: 23/Mar/22  Updated: 04/May/22  Resolved: 23/Mar/22

Status: Closed
Project: Java Driver
Component/s: Change Streams, Cluster Management, Reactive Streams
Affects Version/s: 4.3.0
Fix Version/s: None

Type: Bug Priority: Unknown
Reporter: Dom S Assignee: Jeffrey Yemin
Resolution: Duplicate Votes: 0
Labels: external-user
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates JAVA-4432 The Subscriber of a publisher never g... Closed

 Description   

Summary

When the database connection is lost for more than 30 seconds (longer than the server selection timeout), the change stream stops emitting items and does not propagate any error. It stays stuck.

Please provide the version of the driver. If applicable, please provide the MongoDB server version and topology (standalone, replica set, or sharded cluster).

We are using the reactivestreams driver and creating a Flowable from the change stream publisher.

The problem seems to have been introduced in driver 4.3.0. With the previous version (4.2.3), the following error is delivered to the Flowable and can be handled correctly (by restarting the watch):

com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=localhost:27123, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message}, caused by {java.nio.channels.InterruptedByTimeoutException}}]
    at com.mongodb.internal.connection.BaseCluster.createTimeoutException(BaseCluster.java:413)
    at com.mongodb.internal.connection.BaseCluster.handleServerSelectionRequest(BaseCluster.java:314)
    at com.mongodb.internal.connection.BaseCluster.access$800(BaseCluster.java:62)
    at com.mongodb.internal.connection.BaseCluster$WaitQueueHandler.run(BaseCluster.java:484)
    at java.base/java.lang.Thread.run(Thread.java:833)
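
For illustration, the "restart of the watch" handling that works on 4.2.3 can be sketched roughly as follows. This is a minimal sketch using only the JDK's java.util.concurrent.Flow interfaces, not the driver or RxJava: the names RestartingWatch, watchWithRestart and flakyWatch are hypothetical, and the IllegalStateException is only a stand-in for the MongoTimeoutException above. It shows why the missing onError call matters: the restart logic lives entirely in onError, so if that signal never arrives, nothing resubscribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RestartingWatch {

    /**
     * Subscribes to a fresh publisher from watchFactory. When onError fires,
     * records the error and resubscribes, at most restartsLeft more times.
     */
    static <T> void watchWithRestart(Supplier<Flow.Publisher<T>> watchFactory,
                                     int restartsLeft,
                                     List<T> received,
                                     List<Throwable> errors) {
        watchFactory.get().subscribe(new Flow.Subscriber<T>() {
            @Override public void onSubscribe(Flow.Subscription s) {
                s.request(Long.MAX_VALUE); // unbounded demand, as a simple consumer would use
            }
            @Override public void onNext(T item) {
                received.add(item);
            }
            @Override public void onError(Throwable t) {
                errors.add(t);
                if (restartsLeft > 0) {
                    // The restart: open a new watch in place of the failed one.
                    watchWithRestart(watchFactory, restartsLeft - 1, received, errors);
                }
            }
            @Override public void onComplete() { }
        });
    }

    /**
     * Hypothetical stand-in for a change stream: the first subscription
     * signals an error, every later subscription delivers one event.
     */
    static Supplier<Flow.Publisher<String>> flakyWatch() {
        AtomicInteger attempts = new AtomicInteger();
        return () -> subscriber -> {
            int attempt = attempts.incrementAndGet();
            subscriber.onSubscribe(new Flow.Subscription() {
                @Override public void request(long n) {
                    if (attempt == 1) {
                        subscriber.onError(new IllegalStateException("simulated server selection timeout"));
                    } else {
                        subscriber.onNext("change seen on attempt " + attempt);
                    }
                }
                @Override public void cancel() { }
            });
        };
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        List<Throwable> errors = new ArrayList<>();
        watchWithRestart(flakyWatch(), 3, received, errors);
        System.out.println(received + " after " + errors.size() + " error(s)");
    }
}
```

With the real driver the same idea would normally be expressed through the Flowable's own error handling. The point of the sketch is only that recovery depends on onError being signalled, which is what stops happening from 4.3.0 on.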

How to Reproduce

  • Create a Flowable from a mongo watch and subscribe to it
  • Disconnect the server (e.g. shut down) and connect again after more than 30 seconds
  • Change something in the collection / db
  • Observe that neither onError nor onNext is called in the Flowable.
  • In 4.2.3, onError will be called

Additional Background

I tested driver versions up to 4.5.0 and the issue is still present.



 Comments   
Comment by Jeffrey Yemin [ 23/Mar/22 ]

FYI, 4.5.1 has been released.

Comment by Dom S [ 23/Mar/22 ]

Hello Jeffrey,

Yes, it seems to be fixed with 4.5.1-SNAPSHOT. I think this ticket can be closed, and I hope 4.5.1 will be released soon.

Thanks again for your help.

Comment by Jeffrey Yemin [ 23/Mar/22 ]

Hi nothanks@trash-mail.com

We think this is a duplicate of JAVA-4432. Can you test against the 4.5.1-SNAPSHOT release to confirm? You'll have to add the Sonatype snapshot repository to grab the artifacts.

Comment by Dom S [ 23/Mar/22 ]

Hello Jeffrey,

Thanks for looking into it. Here is the info you requested:

 

Server version: mongodb docker image 4.4.12

Topology: Single-node replica set

Connection string: mongodb://[internal ip]:27025

Default MongoClientSettings, except serverSettings.heartbeatFrequency set to 10 seconds

To disconnect and reconnect, I stopped the docker container and started it again (docker stop xxxx, wait 35 seconds, docker start xxxx).

Comment by Jeffrey Yemin [ 23/Mar/22 ]

Hi nothanks@trash-mail.com,

Thanks for the report. Can you add a few more details to help us reproduce the issue?

  • The server version and topology type (replica set, sharded)
  • The connection string (or MongoClientSettings) used to create the MongoClient
  • The precise steps you took to disconnect and reconnect (did you shut down all the servers at once, one at a time, etc.)
Generated at Thu Feb 08 09:02:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.