  1. C Driver
  2. CDRIVER-4634

Avoiding Errors if Change Stream Events Exceed 16MB

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None

      DRIVERS-2617:
      MongoDB 7.0 is adding a new aggregation stage $changeStreamSplitLargeEvent, which will split large events (>16MB) into smaller fragments. These fragments will contain as many fields as can fit into 16MB along with the added field:

       

      splitEvent: { fragment: 1, of: 3 }

       

      Drivers with strongly-typed change stream events will have to add accessors for the new splitEvent field.

      Note that it was a design decision to require application authors to handle the fragments programmatically. Each fragment will have its own resume token, so application authors will have to handle the case where the pre- and post-images are returned in separate fragments. We will make no attempt to combine multiple fragments into a single change stream event, as doing so raises further questions. For example, if a large event were split into 3 fragments and we combined them into a single event, what would the expected behaviour be?

      1. Do we block until all three fragments are received?
      2. What do we do if we receive 2 fragments and an error? Do we return the two fragments knowing that the third is missing? Do we retry the third and block until received?
      3. If users are serializing the change stream events to BSON for further downstream processing or storage into another MongoDB cluster, they would have to re-implement the splitting logic as the combined event would exceed 16MB.

      In summary, we should expose the splitEvent field and require application authors - who have opted into large event processing - to handle the edge cases explicitly by processing the fragments themselves.
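
      As an illustration of the application-side handling described above, the following is a minimal sketch of a fragment-sequence check in plain C. It is not driver API: the names split_event_t, frag_tracker_t, frag_tracker_init and frag_tracker_feed are hypothetical, and the sketch assumes the application has already read the fragment and of values out of each event's splitEvent field. It also assumes tracking starts at a whole-event boundary; resuming from the resume token of a middle fragment is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the splitEvent sub-document,
 * e.g. splitEvent: { fragment: 1, of: 3 }. */
typedef struct {
   int32_t fragment; /* 1-based index of this fragment */
   int32_t of;       /* total number of fragments for the event */
} split_event_t;

typedef enum {
   FRAG_INCOMPLETE,  /* accepted, more fragments are expected */
   FRAG_COMPLETE,    /* accepted, the event is now complete */
   FRAG_OUT_OF_ORDER /* rejected, the sequence was broken */
} frag_status_t;

/* Hypothetical tracker for one change stream. */
typedef struct {
   int32_t next; /* fragment number expected next (1 when idle) */
   int32_t of;   /* total for the in-progress event (0 when idle) */
} frag_tracker_t;

static void
frag_tracker_init (frag_tracker_t *t)
{
   t->next = 1;
   t->of = 0;
}

/* Feed the splitEvent of the next event, or NULL for an event that
 * carries no splitEvent field. On FRAG_OUT_OF_ORDER the tracker is
 * reset, and the caller decides whether to discard or retry. */
static frag_status_t
frag_tracker_feed (frag_tracker_t *t, const split_event_t *se)
{
   if (se == NULL) {
      /* An unsplit event must not interrupt a split one. */
      if (t->of != 0) {
         frag_tracker_init (t);
         return FRAG_OUT_OF_ORDER;
      }
      return FRAG_COMPLETE;
   }

   bool valid = se->of >= 1 && se->fragment == t->next &&
                (t->of == 0 || se->of == t->of);
   if (!valid) {
      frag_tracker_init (t);
      return FRAG_OUT_OF_ORDER;
   }

   if (se->fragment == se->of) {
      /* Last fragment seen: go back to the idle state. */
      frag_tracker_init (t);
      return FRAG_COMPLETE;
   }

   t->of = se->of;
   t->next = se->fragment + 1;
   return FRAG_INCOMPLETE;
}
```

      With this sketch, feeding { fragment: 1, of: 3 }, { fragment: 2, of: 3 } and { fragment: 3, of: 3 } in order returns FRAG_INCOMPLETE twice and then FRAG_COMPLETE, while a skipped fragment returns FRAG_OUT_OF_ORDER. What the application does in that case (block, retry, or surface the partial data) is exactly the policy choice the questions above leave to application authors.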

      This ticket was split from DRIVERS-2617; please see that ticket for a detailed description.

            Assignee:
            Unassigned
            Reporter:
            dbeng-pm-bot PM Bot
            Votes:
            0
            Watchers:
            4

              Created:
              Updated: