Streaming Mode - Read - Configuration - Rate Limiting

    • Java Drivers

I would greatly appreciate an option to limit the size of incoming streaming batches, similar to Kafka's `maxOffsetsPerTrigger`. This would greatly improve the utility of the Mongo Spark Connector for high-volume collections. At B.Well we currently have several collections whose incoming batches are too large to ingest without extremely large nodes, which are very expensive to run.

I'm aware we could use the `change.stream.micro.batch.max.partition.count` option, but it is sub-optimal for a number of reasons, starting with out-of-order processing: it would force us to introduce an additional hop (an append-only streaming table) between the driver and our target tables in order to sort the changes by clusterTime prior to merging. We'd prefer to keep things simple and reduce overhead by merging directly into our replica dataset.
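
For illustration, here is a rough Scala sketch of the kind of option I have in mind. The `change.stream.micro.batch.max.docs` key is invented here purely to show the shape of the proposal; the name and exact semantics are of course up to you. The Kafka reader alongside it uses Spark's real `maxOffsetsPerTrigger` option for comparison, and the connection string, database/collection names, and checkpoint path are placeholders.

```scala
import org.apache.spark.sql.SparkSession

object RateLimitSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("mongo-change-stream-rate-limit-sketch")
      .getOrCreate()

    // Reading the change stream with the MongoDB Spark Connector (10.x).
    // Today a single micro-batch can contain an unbounded number of change
    // events; `change.stream.micro.batch.max.partition.count` only splits a
    // batch across more partitions (losing order), it does not cap its size.
    val changes = spark.readStream
      .format("mongodb")
      .option("spark.mongodb.connection.uri", "mongodb://<host>/") // placeholder
      .option("spark.mongodb.database", "prod")                    // placeholder
      .option("spark.mongodb.collection", "high_volume_events")    // placeholder
      // Hypothetical option, analogous to Kafka's maxOffsetsPerTrigger:
      // cap the number of change events pulled into each trigger.
      .option("change.stream.micro.batch.max.docs", "100000")
      .load()

    // For contrast, Spark's Kafka source already supports this kind of cap
    // (unused in this sketch, shown only for the analogy):
    val fromKafka = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder
      .option("subscribe", "events")
      .option("maxOffsetsPerTrigger", "100000") // real Spark option
      .load()

    // With bounded batches we could merge each micro-batch directly into the
    // replica dataset; the console sink here just stands in for that merge.
    changes.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/events") // placeholder
      .start()
      .awaitTermination()
  }
}
```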

Please drop me a line if you have any questions or alternative recommendations. Thanks!

cc: seamus.noonan@mongodb.com

Assignee: Unassigned
Reporter: David Belais
Votes: 0
Watchers: 2
