[CSHARP-2146] System.IO.EndOfStreamException when iterating change stream cursor Created: 11/Jan/18  Updated: 04/Apr/23  Resolved: 25/Jun/20

Status: Closed
Project: C# Driver
Component/s: Connectivity, Read Operations
Affects Version/s: 2.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Andrii Litvinov Assignee: Wan Bachtiar
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backported by CSHARP-1059 RunCommand replSetReconfig System.IO.... Closed
Related
is related to SERVER-32496 Start tailing from the beginning of c... Backlog
Case:

 Description   

I got following exception when tried to resume watching changes in collection with new change stream feature in MongoDB 3.6.

This error happened on 2 out of 6 collections I watched. It started occurring after 6-12 hours and it worked normally before. To resolve it I needed to restart watching for the events without resume token which is unfortunate because it can cause loss of information in my system.

System.IO.EndOfStreamException: Attempted to read past the end of the stream.
at void MongoDB.Bson.IO.BsonStreamExtensions.ReadBytes(BsonStream stream, byte[] buffer, int offset, int count)
at int MongoDB.Bson.IO.BsonStreamAdapter.ReadInt32()
at int MongoDB.Bson.IO.BsonBinaryReader.ReadSize()
at void MongoDB.Bson.IO.BsonBinaryReader.ReadStartDocument()
at BsonDocument MongoDB.Bson.Serialization.Serializers.BsonDocumentSerializer.DeserializeValue(BsonDeserializationContext context, BsonDeserializationArgs args)
at TBsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializerBase<TBsonValue>.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize<TValue>(IBsonSerializer<TValue> serializer, BsonDeserializationContext context)
at BsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializer.DeserializeValue(BsonDeserializationContext context, BsonDeserializationArgs args)
at TBsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializerBase<TBsonValue>.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize<TValue>(IBsonSerializer<TValue> serializer, BsonDeserializationContext context)
at BsonDocument MongoDB.Bson.Serialization.Serializers.BsonDocumentSerializer.DeserializeValue(BsonDeserializationContext context, BsonDeserializationArgs args)
at TBsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializerBase<TBsonValue>.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize<TValue>(IBsonSerializer<TValue> serializer, BsonDeserializationContext context)
at BsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializer.DeserializeValue(BsonDeserializationContext context, BsonDeserializationArgs args)
at TBsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializerBase<TBsonValue>.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize<TValue>(IBsonSerializer<TValue> serializer, BsonDeserializationContext context)
at BsonDocument MongoDB.Bson.Serialization.Serializers.BsonDocumentSerializer.DeserializeValue(BsonDeserializationContext context, BsonDeserializationArgs args)
at TBsonValue MongoDB.Bson.Serialization.Serializers.BsonValueSerializerBase<TBsonValue>.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize<TValue>(IBsonSerializer<TValue> serializer, BsonDeserializationContext context)
at ChangeStreamDocument<TDocument> MongoDB.Driver.ChangeStreamDocumentSerializer<TDocument>.DeserializeValue(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.Serializers.SealedClassSerializerBase<TValue>.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
at TValue MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize<TValue>(IBsonSerializer<TValue> serializer, BsonDeserializationContext context)
at TDocument MongoDB.Driver.Core.Operations.ChangeStreamCursor<TDocument>.DeserializeDocument(RawBsonDocument rawDocument)
at IEnumerable<TDocument> MongoDB.Driver.Core.Operations.ChangeStreamCursor<TDocument>.DeserializeDocuments(IEnumerable<RawBsonDocument> rawDocuments)
at void MongoDB.Driver.Core.Operations.ChangeStreamCursor<TDocument>.ProcessBatch(bool hasMore)
at async Task<bool> MongoDB.Driver.Core.Operations.ChangeStreamCursor<TDocument>.MoveNextAsync(CancellationToken cancellationToken)



 Comments   
Comment by Rachelle Palmer [ 25/Jun/20 ]

Hi there, thank you for reaching out to MongoDB. As this sounds more like a support issue, I wanted to give you some resources to get this question answered more quickly:

  • our MongoDB support portal, located at support.mongodb.com
  • our MongoDB community forums, located here
  • If you are an Atlas customer, there is free support offered 24/7 in the lower right hand corner of the UI.

Thank you!
Rachelle

Comment by Andrii Litvinov [ 17/Apr/19 ]

Hi Wan,

Thank you for the comment. Replica set didn't fail over during the issue. The symptoms were that I started collection.watch and after some hours (days?) I got the exception that I posted in the description. After this exception my service tried to restart watching but then the error that you suggested happened (Command aggregate failed: The resume token UUID does not exist. Has the collection been dropped?). Could be that the service was watching for the collection for a long time and no changes happened to the collection for a time longer than the oplog window.

I saw this exception multiple times during the testing of the service in staging environment, but I cannot provide steps to reproduce. Only thing that I can tell that the collections were not often modified and service was meant to be running without restart for extended period, like always.

>Change Streams is built on top of oplog. If the last operation seen has fallen off the oplog, tailing oplog is essentially skipping some operations as well. i.e. similar to just calling Watch() without token.
The difference with tailing oplog is that I could start with the oldest operation and with change stream I could start only with the latest. That's why we decided to drop change stream implementation and are using oplog tailing up until now. That's why I cannot tell if the issue was fixed in 2.7 release.

Cheers,
Andrii

Comment by Wan Bachtiar [ 11/Apr/19 ]

It seems that issue happened when there were no operations in oplog for the token I supplied

Hi,

If you call Collection.Watch() (both in v2.5 and v2.7) with a resume token that no longer exists in the oplog, you should be getting MongoCommandException with the following message, in MongoDB v3.6:

Unhandled Exception: System.AggregateException: One or more errors occurred. (Command aggregate failed: The resume token UUID does not exist. Has the collection been dropped?.) ---> MongoDB.Driver.MongoCommandException: Command aggregate failed: The resume token UUID does not exist. Has the collection been dropped?.

Are you able to consistently reproduce the error ? If so, would you be able to provide:

  • Minimal and complete code snippets
  • Steps to reproduce the error

Also, please test against MongoDB C# driver v2.7+

That's why I decided to change implementation of my service to oplog tailing instead of change stream

Change Streams is built on top of oplog. If the last operation seen has fallen off the oplog, tailing oplog is essentially skipping some operations as well. i.e. similar to just calling Watch() without token.

Based on the description of CSHARP-2267 I believe it is. Just to be sure, after the fix, some different error will be thrown, right?

The patch in CSHARP-2267 is to ensure that getMore() after an election is retried (This is actually not related to the use of resume token). Without knowing exactly how the issue you're having is triggered, it's difficult to conclude that this is the same issue. In v2.7.1+, Watch() would be retryable without error after replica set failovers.

Regards,
Wan.

Comment by Andrii Litvinov [ 29/Jan/19 ]

Hey Ian thank you getting back to me. It seems that issue happened when there were no operations in oplog for the token I supplied. Means I couldn't successfully resume the change stream and might miss new events. That's why I decided to change implementation of my service to oplog tailing instead of change stream. So I cannot confirm that the issue if fixed for me. Based on the description of  CSHARP-2267 I believe it is.

 

Just to be sure, after the fix, some different error will be thrown, right?

Comment by Ian Whalen (Inactive) [ 28/Jan/19 ]

Andrii Litvinov could you please test against 2.7.3? Given that you were seeing this under 2.5, we believe we have fixed this issue since (see CSHARP-2267).

Generated at Wed Feb 07 21:41:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.