- Type: Improvement
- Resolution: Won't Fix
- Priority: Unknown
- Affects Version/s: None
- Component/s: Source
When a change event in the change stream exceeds the 16MB limit, the existing change stream is closed with an exception and a new change stream is opened. In a system with a higher update load, this will likely miss change events during the time it takes to start the new change stream. I have two proposals for improvement.
Solution #1
The error message of the exception contains the resumeToken of the failed event. Use "ChangeStream.startAfter(<resumeToken>)" to start the new stream just after the failed event, leading to zero loss of events (see the sketch after the example message below).
Example error message
BSONObj size: 19001449 (0x121F069) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "826115AEE9000000012B022C0100296E5A1004D148D6B22E8F49B3A65DAE80A4683566463C5F6964003C316663726B36326F6D30303030303030000004" }
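A minimal sketch of this recovery path, assuming the MongoDB Java sync driver. The class name, the regex that pulls the token's "_data" value out of the error message, and the use of MongoException are illustrative assumptions only; the exact exception type and message format are not guaranteed by the server.

```java
import com.mongodb.MongoException;
import com.mongodb.client.ChangeStreamIterable;
import com.mongodb.client.MongoCollection;
import org.bson.BsonDocument;
import org.bson.BsonString;
import org.bson.Document;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResumeAfterOversizedEvent {

    // Hypothetical pattern for extracting the resume token's _data hex string
    // from the "BSONObj size ... is invalid" message shown above.
    private static final Pattern RESUME_TOKEN_PATTERN =
            Pattern.compile("_data: \"([0-9A-Fa-f]+)\"");

    public static void watchWithRecovery(MongoCollection<Document> collection) {
        BsonDocument startAfter = null;
        while (true) {
            try {
                ChangeStreamIterable<Document> stream = collection.watch();
                if (startAfter != null) {
                    // Resume just after the event that could not be returned.
                    stream = stream.startAfter(startAfter);
                }
                stream.forEach(event -> {
                    // process the change event ...
                });
            } catch (MongoException e) {
                Matcher m = RESUME_TOKEN_PATTERN.matcher(e.getMessage());
                if (m.find()) {
                    // Rebuild the resume token document from the _data value in the message.
                    startAfter = new BsonDocument("_data", new BsonString(m.group(1)));
                } else {
                    throw e; // not the oversized-event case; rethrow
                }
            }
        }
    }
}
```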
Solution #2
Increment the "clusterTime" (introduced in v4.0), available in the MongoCommandException, by 1 ordinal and use it with "ChangeStream.startAtOperationTime(<BsonTimestamp>)" (see the sketch below).
For a sharded cluster, it is possible that multiple events share the same cluster time, so this approach can skip a few good events that have the same timestamp as the bad one.
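A minimal sketch of this idea, assuming the failed command's response carries the cluster time under the "$clusterTime" key of MongoCommandException.getResponse(); the helper name and surrounding structure are illustrative, not the connector's actual code.

```java
import com.mongodb.MongoCommandException;
import com.mongodb.client.ChangeStreamIterable;
import com.mongodb.client.MongoCollection;
import org.bson.BsonDocument;
import org.bson.BsonTimestamp;
import org.bson.Document;

public class ResumeAtNextOperationTime {

    // Derive a restart point from the clusterTime carried in the failed command's
    // response and bump its ordinal by one so the oversized event is skipped.
    public static ChangeStreamIterable<Document> reopen(MongoCollection<Document> collection,
                                                        MongoCommandException failure) {
        BsonDocument response = failure.getResponse();
        // Assumption: the response contains { $clusterTime: { clusterTime: <timestamp>, ... } }.
        BsonTimestamp clusterTime = response.getDocument("$clusterTime")
                                            .getTimestamp("clusterTime");
        // Advance the ordinal (increment) component by one.
        BsonTimestamp next = new BsonTimestamp(clusterTime.getTime(), clusterTime.getInc() + 1);
        return collection.watch().startAtOperationTime(next);
    }
}
```

As noted above, on a sharded cluster several events can share this timestamp, so events other than the bad one may also be skipped.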
- design is described in
  - KAFKA-381 Support change stream split large events (Needs Triage)
- related to
  - SERVER-55062 Change stream events can exceed 16MB with no workaround (Closed)
  - KAFKA-381 Support change stream split large events (Needs Triage)