[CSHARP-4633] Avoiding Errors if Change Stream Events Exceed 16MB Created: 02/May/23  Updated: 31/Oct/23  Resolved: 31/Oct/23

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: None
Fix Version/s: 2.23.0

Type: Improvement Priority: Major - P3
Reporter: PM Bot Assignee: Boris Dogadov
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
Duplicate
is duplicated by CSHARP-4726 Backport: Avoiding Errors if Change S... Closed
Issue split
Server Compat: 7.0
Upstream Changes Summary:

DRIVERS-2617:
MongoDB 7.0 is adding a new aggregation stage $changeStreamSplitLargeEvent, which will split large events (>16MB) into smaller fragments. These fragments will contain as many fields as can fit into 16MB along with the added field:

 

splitEvent: { fragment: 1, of: 3 }

 

Drivers with strongly-typed change stream events will have to add accessors for the new splitEvent field.

Note that it was a design decision to require application authors to programmatically handle the fragments. Each fragment will have its own resume token. Thus application authors will have to handle the case that the pre- and post-images can be returned in separate fragments. We will make no attempt to combine multiple fragments into a single change stream event as this can have additional consequences. For example, let's say a large event was split into 3 fragments, what is the expected behaviour if we are combining the fragments into a single event?

1. Do we block until all three fragments are received?
2. What do we do if we receive 2 fragments and an error? Do we return the two fragments knowing that the third is missing? Do we retry the third and block until received?
3. If users are serializing the change stream events to BSON for further downstream processing or storage into another MongoDB cluster, they would have to re-implement the splitting logic as the combined event would exceed 16MB.

In summary, we should expose the splitEvent field and require application authors who have opted into large event processing to handle the edge cases explicitly by processing the fragments themselves.
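The edge cases listed above (whether to wait for all fragments, and what to do when a fragment is missing) can be made explicit in application code. The following is a minimal sketch only. The Fragment and FragmentCollector types are hypothetical stand-ins introduced for illustration and are not part of the C# driver; fields are modelled as a plain dictionary rather than BSON.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one fragment of a split change stream event.
public sealed class Fragment
{
    public Fragment(int index, int of, IReadOnlyDictionary<string, object> fields)
    {
        Index = index;
        Of = of;
        Fields = fields;
    }

    public int Index { get; }   // corresponds to splitEvent.fragment (1-based)
    public int Of { get; }      // corresponds to splitEvent.of
    public IReadOnlyDictionary<string, object> Fields { get; }
}

// Hypothetical helper: buffers the fragments of one event without blocking,
// so the caller decides how to react to a gap or an incomplete event.
public sealed class FragmentCollector
{
    private readonly Dictionary<string, object> _merged = new Dictionary<string, object>();
    private int _expectedNext = 1;
    private int _total;

    public bool IsComplete => _total > 0 && _expectedNext > _total;

    // Adds one fragment; throws if it is malformed, skipped, or out of order.
    public void Add(Fragment fragment)
    {
        if (IsComplete)
        {
            throw new InvalidOperationException("All fragments were already received.");
        }
        if (fragment.Of < 1 || fragment.Index > fragment.Of)
        {
            throw new ArgumentException("Fragment numbering is invalid.", nameof(fragment));
        }
        if (fragment.Index != _expectedNext || (_total != 0 && fragment.Of != _total))
        {
            throw new InvalidOperationException(
                $"Expected fragment {_expectedNext} but received {fragment.Index} of {fragment.Of}.");
        }

        _total = fragment.Of;
        foreach (var pair in fragment.Fields)
        {
            _merged[pair.Key] = pair.Value;
        }
        _expectedNext++;
    }

    // Returns the combined fields; only valid once every fragment has arrived.
    public IReadOnlyDictionary<string, object> GetMergedFields()
    {
        if (!IsComplete)
        {
            throw new InvalidOperationException("The event is still missing fragments.");
        }
        return _merged;
    }
}
```

A caller would create one collector per split event, call Add for each fragment as it arrives, and check IsComplete before reading the merged fields. Note that a merged result can exceed 16MB, so it cannot be serialized back into a single BSON document.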

Documentation Changes: Needed
Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
Provide a usage example that shows how to use the API as is, and also an example of a helper method that combines the fragments into a single event (attached in the comment).

2. Would you like the user to see examples of the syntax and/or executable code and its output?
Yes

3. Which versions of the driver/connector does this apply to?
2.23


 Description   

This ticket was split from DRIVERS-2617; please see that ticket for a detailed description.



 Comments   
Comment by Boris Dogadov [ 31/Oct/23 ]

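// Requires: using System; using System.Collections.Generic; using MongoDB.Driver;
// Returns the next change stream event, merging the fragments of a split event
// into the first fragment. Returns null if the enumerator has no further events.
private static ChangeStreamDocument<TDocument> GetNextChangeStreamEvent<TDocument>(IEnumerator<ChangeStreamDocument<TDocument>> changeStreamEnumerator)
{
    if (!changeStreamEnumerator.MoveNext())
    {
        return null;
    }

    var changeStreamEvent = changeStreamEnumerator.Current;
    if (changeStreamEvent.SplitEvent != null)
    {
        var fragment = changeStreamEvent;
        while (fragment.SplitEvent.Fragment < fragment.SplitEvent.Of)
        {
            // Fail loudly instead of returning a partially merged event.
            if (!changeStreamEnumerator.MoveNext())
            {
                throw new InvalidOperationException("Change stream ended before all fragments of a split event were received.");
            }
            fragment = changeStreamEnumerator.Current;
            MergeFragment(changeStreamEvent, fragment);
        }
    }

    return changeStreamEvent;

    // Copies every field except the per-fragment _id (resume token) and splitEvent.
    static void MergeFragment(
        ChangeStreamDocument<TDocument> changeStreamEvent,
        ChangeStreamDocument<TDocument> fragment)
    {
        foreach (var element in fragment.BackingDocument)
        {
            if (element.Name != "_id" && element.Name != "splitEvent")
            {
                changeStreamEvent.BackingDocument[element.Name] = element.Value;
            }
        }
    }
}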

Comment by Githook User [ 31/Oct/23 ]

Author: BorisDog <BorisDog@users.noreply.github.com>

Message: CSHARP-4633: Avoiding Errors if Change Stream Events Exceed 16MB (#1209)
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/0e2bc5fe58ee4043105014f5d4d6ae545e350568

Comment by James Kovacs [ 14/Aug/23 ]

NOTE: DRIVERS-2674 / CSHARP-4726 backports support for $changeStreamSplitLargeEvent to MongoDB 6.0.9, but not to the 6.x rapid releases. Tests should be run on 6.0.9 and later 6.0.x releases (but not on 6.1 or later rapid releases) and on 7.0.0+.
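The version rule in the note above could be expressed as a small gate for test runs. This is a sketch under the assumption that the server version is available as a System.Version; the SplitLargeEventSupport class and its IsSupported method are hypothetical names, not driver APIs.

```csharp
using System;

// Hypothetical helper that encodes the server versions named in the note above.
public static class SplitLargeEventSupport
{
    // True for 6.0.9 and later within the 6.0 series, and for 7.0.0 and later.
    // The 6.x rapid releases (6.1, 6.2, 6.3) are excluded.
    public static bool IsSupported(Version serverVersion)
    {
        if (serverVersion.Major >= 7)
        {
            return true;
        }

        return serverVersion.Major == 6
            && serverVersion.Minor == 0
            && serverVersion.Build >= 9;
    }
}
```

For example, IsSupported(new Version(6, 0, 9)) and IsSupported(new Version(7, 0, 0)) both return true, while IsSupported(new Version(6, 1, 0)) returns false.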

Generated at Wed Feb 07 21:48:50 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.