[CSHARP-4325] Timeouts / or Unable to authenticate in change streams Created: 16/Sep/22  Updated: 27/Oct/23  Resolved: 03/Oct/22

Status: Closed
Project: C# Driver
Component/s: Change Streams
Affects Version/s: 2.17.1
Fix Version/s: None

Type: Question Priority: Minor - P4
Reporter: Sasan Pasha Assignee: James Kovacs
Resolution: Gone away Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Quarter: FY22Q4

 Description   

When the change stream is created and the watcher is watching the changes - any interruption in connectivity on the client side for example a mobile app ( from wifi to cellular) and vice versa, causes the WatchAsync method to throw 2 types of exceptions.

The impact is no longer the watcher works and the client won't be able to receive the updates. A suggestion is to make the DB connection more robust and not allow connectivity issues to interrupt the service.

 

Exception1:

 

Error formatting macro: code: java.lang.StackOverflowError

ErrorMessage:A timeout occurred after 30000ms selecting a server using CompositeServerSelector{ Selectors = ReadPreferenceServerSelector{ ReadPreference =

{ Mode : Primary }

}, LatencyLimitingServerSelector

{ AllowedLatencyRange = 00:00:00.0150000 }

, OperationsCountServerSelector }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "ReplicaSet", Type : "ReplicaSet", State : "Disconnected", Servers : [{ ServerId: "

{ ClusterId : 1, EndPoint : "Unspecified/spotlee-westus2.mongo.cosmos.azure.com:10255" }

", EndPoint: "Unspecified/spotlee-westus2.mongo.cosmos.azure.com:10255", ReasonChanged: "Heartbeat", State: "Disconnected", ServerVersion: , TopologyVersion: , Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server. ---> System.Net.Sockets.SocketException: Could not resolve host 'spotlee-westus2.mongo.cosmos.azure.com' at System.Net.Dns.Error_11001 (System.String hostName) [0x00015] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at System.Net.Dns.GetHostByName (System.String hostName) [0x00021] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at System.Net.Dns.GetHostEntry (System.String hostNameOrAddress) [0x00052] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at System.Net.Dns.GetHostAddresses (System.String hostNameOrAddress) [0x00056] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at MongoDB.Driver.Core.Connections.TcpStreamFactory.ResolveEndPoints (System.Net.EndPoint initial) [0x0007b] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.TcpStreamFactory.CreateStream (System.Net.EndPoint endPoint, System.Threading.CancellationToken cancellationToken) [0x00000] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.SslStreamFactory.CreateStream (System.Net.EndPoint endPoint, System.Threading.CancellationToken cancellationToken) [0x00000] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelper (System.Threading.CancellationToken cancellationToken) [0x0000f] in <4ae800ad3ec4467cb63945849677f181>:0 — End of inner exception stack trace — at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelper (System.Threading.CancellationToken cancellationToken) [0x00097] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.BinaryConnection.Open (System.Threading.CancellationToken cancellationToken) [0x00088] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Servers.ServerMonitor.InitializeConnection (System.Threading.CancellationToken cancellationToken) [0x00033] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Servers.ServerMonitor.Heartbeat (System.Threading.CancellationToken cancellationToken) [0x00045] in <4ae800ad3ec4467cb63945849677f181>:0 ", LastHeartbeatTimestamp: "2022-09-16T18:26:32.4418550Z", LastUpdateTimestamp: "2022-09-16T18:26:32.4418570Z" }] }., source:mscorlib, InnerException:, StackTrace: at MongoDB.Driver.Core.Clusters.Cluster.ThrowTimeoutException (MongoDB.Driver.Core.Clusters.ServerSelectors.IServerSelector selector, MongoDB.Driver.Core.Clusters.ClusterDescription description) [0x00013] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Clusters.Cluster+WaitForDescriptionChangedHelper.HandleCompletedTask (System.Threading.Tasks.Task completedTask) [0x00009] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Clusters.Cluster.WaitForDescriptionChangedAsync (MongoDB.Driver.Core.Clusters.ServerSelectors.IServerSelector selector, MongoDB.Driver.Core.Clusters.ClusterDescription description, System.Threading.Tasks.Task descriptionChangedTask, System.TimeSpan timeout, System.Threading.CancellationToken cancellationToken) [0x000ac] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Clusters.Cluster.SelectServerAsync (MongoDB.Driver.Core.Clusters.ServerSelectors.IServerSelector selector, System.Threading.CancellationToken cancellationToken) [0x00110] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Clusters.IClusterExtensions.SelectServerAndPinIfNeededAsync (MongoDB.Driver.Core.Clusters.ICluster cluster, MongoDB.Driver.Core.Bindings.ICoreSessionHandle session, MongoDB.Driver.Core.Clusters.ServerSelectors.IServerSelector selector, System.Threading.CancellationToken cancellationToken) [0x00095] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Bindings.ReadPreferenceBinding.GetReadChannelSourceAsync (System.Threading.CancellationToken cancellationToken) [0x0008c] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.RetryableReadContext.InitializeAsync (System.Threading.CancellationToken cancellationToken) [0x0008d] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.RetryableReadContext.CreateAsync (MongoDB.Driver.Core.Bindings.IReadBinding binding, System.Boolean retryRequested, System.Threading.CancellationToken cancellationToken) [0x000d4] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.ChangeStreamOperation`1[TResult].ResumeAsync (MongoDB.Driver.Core.Bindings.IReadBinding binding, System.Threading.CancellationToken cancellationToken) [0x00082] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.ChangeStreamCursor`1[TDocument].ResumeAsync (System.Threading.CancellationToken cancellationToken) [0x00083] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.ChangeStreamCursor`1[TDocument].MoveNextAsync (System.Threading.CancellationToken cancellationToken) [0x0012b] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.IAsyncCursorExtensions.ForEachAsync[TDocument] (MongoDB.Driver.IAsyncCursor`1[TDocument] source, System.Action`2[T1,T2] processor, System.Threading.CancellationToken cancellationToken) [0x0010b] in <4ae800ad3ec4467cb63945849677f181>:0 at Infrastructure.Database.CosmoMongoDb.HandleChangeStreamAsync[T] (MongoDB.Driver.IMongoDatabase db, Infrastructure.Enums.DatabaseCollectionsEnum collectionName, System.Threading.CancellationToken token) [0x0031b] in <5f17bfe7f847482aabe0b5966c2ebc23>:0
 

Execption2:

ErrorMessage:Unable to authenticate using sasl protocol mechanism SCRAM-SHA-256., source:mscorlib, InnerException:MongoDB.Driver.MongoConnectionException: An exception occurred while sending a message to the server. ---> System.IO.IOException: Unable to read data from the transport connection: Network subsystem is down. ---> System.Net.Sockets.SocketException: Network subsystem is down --- End of inner exception stack trace --- at System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.ThrowException (System.Net.Sockets.SocketError error) [0x00007] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult (System.Int16 token) [0x0001f] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at System.Threading.Tasks.ValueTask+ValueTaskSourceAsTask+<>c.<.cctor>b__4_0 (System.Object state) [0x00030] in <02432465956f42cebcacad2804482ab1>:0 --- End of stack trace from previous location where exception was thrown --- at Mono.Net.Security.MobileAuthenticatedStream.InnerWrite (System.Boolean sync, System.Threading.CancellationToken cancellationToken) [0x000d3] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at Mono.Net.Security.AsyncProtocolRequest.ProcessOperation (System.Threading.CancellationToken cancellationToken) [0x00199] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at Mono.Net.Security.AsyncProtocolRequest.StartOperation (System.Threading.CancellationToken cancellationToken) [0x0008b] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at Mono.Net.Security.MobileAuthenticatedStream.StartOperation (Mono.Net.Security.MobileAuthenticatedStream+OperationType type, Mono.Net.Security.AsyncProtocolRequest asyncRequest, System.Threading.CancellationToken cancellationToken) [0x0024b] in <3a601a1b3c5047a7ab4dafecfdb16515>:0 at MongoDB.Driver.Core.Misc.StreamExtensionMethods.WriteAsync (System.IO.Stream stream, System.Byte[] buffer, System.Int32 offset, System.Int32 count, System.TimeSpan timeout, System.Threading.CancellationToken cancellationToken) [0x00145] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Misc.StreamExtensionMethods.WriteBytesAsync (System.IO.Stream stream, MongoDB.Bson.IO.IByteBuffer buffer, System.Int32 offset, System.Int32 count, System.TimeSpan timeout, System.Threading.CancellationToken cancellationToken) [0x00121] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.BinaryConnection.SendBufferAsync (MongoDB.Bson.IO.IByteBuffer buffer, System.Threading.CancellationToken cancellationToken) [0x00152] in <4ae800ad3ec4467cb63945849677f181>:0 --- End of inner exception stack trace --- at MongoDB.Driver.Core.Connections.BinaryConnection.SendBufferAsync (MongoDB.Bson.IO.IByteBuffer buffer, System.Threading.CancellationToken cancellationToken) [0x0018b] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.BinaryConnection.SendMessagesAsync (System.Collections.Generic.IEnumerable`1[T] messages, MongoDB.Driver.Core.WireProtocol.Messages.Encoders.MessageEncoderSettings messageEncoderSettings, System.Threading.CancellationToken cancellationToken) [0x00221] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.WireProtocol.CommandUsingQueryMessageWireProtocol`1[TCommandResult].ExecuteAsync (MongoDB.Driver.Core.Connections.IConnection connection, System.Threading.CancellationToken cancellationToken) [0x000aa] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Authentication.SaslAuthenticator.AuthenticateAsync (MongoDB.Driver.Core.Connections.IConnection connection, MongoDB.Driver.Core.Connections.ConnectionDescription description, System.Threading.CancellationToken cancellationToken) [0x0013b] in <4ae800ad3ec4467cb63945849677f181>:0 , StackTrace: at MongoDB.Driver.Core.Authentication.SaslAuthenticator.AuthenticateAsync (MongoDB.Driver.Core.Connections.IConnection connection, MongoDB.Driver.Core.Connections.ConnectionDescription description, System.Threading.CancellationToken cancellationToken) [0x00156] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Authentication.AuthenticationHelper.AuthenticateAsync (MongoDB.Driver.Core.Connections.IConnection connection, MongoDB.Driver.Core.Connections.ConnectionDescription description, System.Collections.Generic.IReadOnlyList`1[T] authenticators, System.Threading.CancellationToken cancellationToken) [0x000e5] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.ConnectionInitializer.AuthenticateAsync (MongoDB.Driver.Core.Connections.IConnection connection, MongoDB.Driver.Core.Connections.ConnectionDescription description, System.Threading.CancellationToken cancellationToken) [0x000c2] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Connections.BinaryConnection.OpenHelperAsync (System.Threading.CancellationToken cancellationToken) [0x00229] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.ConnectionPools.ExclusiveConnectionPool+PooledConnection.OpenAsync (System.Threading.CancellationToken cancellationToken) [0x000b1] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.ConnectionPools.ExclusiveConnectionPool+ConnectionCreator.CreateOpenedInternalAsync (System.Threading.CancellationToken cancellationToken) [0x00083] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.ConnectionPools.ExclusiveConnectionPool+ConnectionCreator.CreateOpenedOrReuseAsync (System.Threading.CancellationToken cancellationToken) [0x00207] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.ConnectionPools.ExclusiveConnectionPool+AcquireConnectionHelper.AcquireConnectionAsync (System.Threading.CancellationToken cancellationToken) [0x001b2] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.ConnectionPools.ExclusiveConnectionPool.AcquireConnectionAsync (System.Threading.CancellationToken cancellationToken) [0x00087] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Servers.Server.GetChannelAsync (System.Threading.CancellationToken cancellationToken) [0x000b0] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.RetryableReadContext.InitializeAsync (System.Threading.CancellationToken cancellationToken) [0x00114] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.RetryableReadContext.CreateAsync (MongoDB.Driver.Core.Bindings.IReadBinding binding, System.Boolean retryRequested, System.Threading.CancellationToken cancellationToken) [0x000d4] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.ChangeStreamOperation`1[TResult].ResumeAsync (MongoDB.Driver.Core.Bindings.IReadBinding binding, System.Threading.CancellationToken cancellationToken) [0x00082] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.ChangeStreamCursor`1[TDocument].ResumeAsync (System.Threading.CancellationToken cancellationToken) [0x00083] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.Core.Operations.ChangeStreamCursor`1[TDocument].MoveNextAsync (System.Threading.CancellationToken cancellationToken) [0x0012b] in <4ae800ad3ec4467cb63945849677f181>:0 at MongoDB.Driver.IAsyncCursorExtensions.ForEachAsync[TDocument] (MongoDB.Driver.IAsyncCursor`1[TDocument] source, System.Action`2[T1,T2] processor, System.Threading.CancellationToken cancellationToken) [0x0010b] in <4ae800ad3ec4467cb63945849677f181>:0 at Infrastructure.Database.CosmoMongoDb.HandleChangeStreamAsync[T] (MongoDB.Driver.IMongoDatabase db, Infrastructure.Enums.DatabaseCollectionsEnum collectionName, System.Threading.CancellationToken token) [0x0031b] in <5f17bfe7f847482aabe0b5966c2ebc23>:0  



 Comments   
Comment by PM Bot [ 03/Oct/22 ]

There hasn't been any recent activity on this ticket, so we're resolving it. Thanks for reaching out! Please feel free to comment on this if you're able to provide more information.

Comment by Sasan Pasha [ 16/Sep/22 ]

thank you so much for your help and support James.

Comment by James Kovacs [ 16/Sep/22 ]

Hi, sasan.pasha@gmail.com,

Each ChangeStreamDocument<T> will have a ResumeToken property (which is the _id of the underlying document). You can simply store the last ResumeToken received from a change stream. When establishing the change stream, you can pass that token in via the ChangeStreamOptions.ResumeAfter property. See Passing options to the Watch and WatchAsync methods.

Please note that Microsoft Azure Cosmos DB is not a MongoDB product and as such is not supported by MongoDB. You should be able to use ChangeStreamDocument<T>.ResumeToken as the resume token. If that doesn't work as expected, please contact Microsoft Support for information on how to correctly resume a change stream with Cosmos DB.

Sincerely,
James

Comment by Sasan Pasha [ 16/Sep/22 ]

thanks James for the prompt response.  I already have the logic in a try/catch block. is there a sample code stream to show me how to resume a token? Please note the cluster is Azure cosmoDB and the data I receive is like this. I am not sure which of the data is the token

 

{
  "_id" : {
    "_data" : new BinData(0, "eyJWIjoyLCJSaWQiOiJqaWMwQU8rLUVJRT0iLCJDb250aW51YXRpb24iOlt7IkZlZWRSYW5nZSI6eyJ0eXBlIjoiRWZmZWN0aXZlIFBhcnRpdGlvbiBLZXkgUmFuZ2UiLCJ2YWx1ZSI6eyJtaW4iOiIiLCJtYXgiOiJGRiJ9fSwiU3RhdGUiOnsidHlwZSI6ImNvbnRpbnVhdGlvbiIsInZhbHVlIjoiXCIzNFwiIn19XX0="),
    "_kind" : 1
  },
  "fullDocument" : {
    "_id" : ObjectId("62439ba0272e409582330653"),
    "partitionKey" : "31c327f2-61b8-4d3f-849a-d3b51220829d",
    }
  },
  "ns" : {
    "db" : "Hubs",
    "coll" : "Bookings"
  },
  "documentKey" : {
    "partitionKey" : "31c327f2-61b8-4d3f-849a-d3b51220829d",
    "_id" : ObjectId("62439ba0272e409582330653")
  }
} 

Comment by James Kovacs [ 16/Sep/22 ]

Hi, sasan.pasha@gmail.com,

Thank you for your question. When you establish a change stream with a cluster, change events are read within the context of a retryable read. Retryable reads currently only retry once on a failed operation. This retry includes server selection thus allowing the client to recover from a crashed cluster node or network partition.

I suspect when you experience a mobile network connectivity interruption that it is lasting longer than the time it takes to retry the operation thus causing the retry to fail as well. You'll notice the inner exception in the authentication-related exception:

System.Net.Sockets.SocketException: Network subsystem is down

The first exception is likely due to the retry failing to an already-connected server. The second exception is likely due to a retry to a server without any available connections in its connection pool.

The recommended way to handle this type of situation is to wrap your change stream in a try/catch block and retry the operation using the resume token (ChangeStreamDocument<T>.ResumeToken) of the last-returned change. You can pass this into the new change stream via ChangeStreamOptions.ResumeAfter. See our Change Stream documentation for more information.

Please let us know if you have any additional questions.

Sincerely,
James

Generated at Wed Feb 07 21:47:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.