Version 11 of the Mongo Spark Connector breaks when reading from a stream with Spark Connect; it looks like a serialization issue. I've reproduced this with pyspark 4.0 and 4.1 plus Databricks Connect against DBR 17.2, DBR 17.3, and DBR 18.0. Here's an example of the error:
pyspark.errors.exceptions.connect.SparkException: Job aborted due to stage failure: Task 5 in stage 122.0 failed 4 times, most recent failure: Lost task 5.3 in stage 122.0 (TID 480) (172.21.12.249 executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.generic.DefaultSerializationProxy to field
(full stack trace attached)
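For context, a minimal streaming read along these lines is enough to trigger it. This is a sketch only: the connection URI, database, collection, and checkpoint path are placeholders, not values from the failing job, and the options follow the connector's documented streaming read configuration.

# Minimal repro sketch against a Databricks Connect (Spark Connect) session.
# Connection details below are placeholders.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()

stream_df = (
    spark.readStream.format("mongodb")
    .option("spark.mongodb.connection.uri", "<connection-uri>")  # placeholder
    .option("spark.mongodb.database", "<database>")              # placeholder
    .option("spark.mongodb.collection", "<collection>")          # placeholder
    .option("spark.mongodb.change.stream.publish.full.document.only", "true")
    .load()
)

# Any sink that forces tasks onto the executors surfaces the ClassCastException above.
query = (
    stream_df.writeStream.format("console")
    .option("checkpointLocation", "/tmp/mongo-stream-checkpoint")  # placeholder
    .start()
)
query.awaitTermination()

The same read works when run directly on the cluster (classic Spark session), which is why the failure looks specific to how task closures are serialized under Spark Connect.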