Details
Type: Task
Resolution: Fixed
Priority: Major - P3
Description
New releases
2.4.1, 2.3.3, 2.2.7, 2.1.6
ChangeLog (same for all releases)
- Ensure nullable fields or container types accept null values
- Added ReadConfig.batchSize property
- Renamed system property spark.mongodb.keep_alive_ms to mongodb.keep_alive_ms
- Added MongoDriverInformation to the default MongoClient
- Updated to the latest Java driver (3.10.+)
- Updated PartitionerHelper.matchQuery so it no longer includes $ne/$exists checks
- Added logging of partitioners and their queries
- Added WriteConfig.extendedBsonTypes setting, so users can disable extended BSON types when writing
- Added Java SPI support, so the short form can be used: spark.read.format("mongo")
Source: https://github.com/mongodb/mongo-spark/blob/master/doc/7-Changelog.md#241
New Configuration Options:
Input Configuration:
- batchSize - the optional size for the internal batches used within the cursor
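As a rough illustration of the batchSize setting from PySpark, the sketch below builds the reader options and hands them to the short-form data source. The helper names and the example values are hypothetical, and it assumes the connector accepts unprefixed keys such as uri and batchSize in the options map; check the configuration documentation for your release before relying on it.

```python
def mongo_read_options(uri, batch_size=None):
    """Build reader options; batchSize is left out unless explicitly given."""
    options = {"uri": uri}
    if batch_size is not None:
        # Spark data source options are passed as strings.
        options["batchSize"] = str(batch_size)
    return options


def load_collection(spark, uri, batch_size=None):
    """Load a collection as a DataFrame; `spark` is an existing SparkSession."""
    return (
        spark.read.format("mongo")
        .options(**mongo_read_options(uri, batch_size))
        .load()
    )
```

Leaving batch_size unset keeps the cursor on the driver's default batch size.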
Output Configuration:
- extendedBsonTypes - enables extended BSON types when writing data to MongoDB. Default: true
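A minimal PySpark sketch of turning extended BSON types off on write follows. The helper names are hypothetical. It assumes that extendedBsonTypes can be passed as an unprefixed writer option and that the short format name "mongo" also applies to writes; neither is stated in the notes above.

```python
def mongo_write_options(uri, extended_bson_types=True):
    """Build writer options; only the opt-out is sent, since the default is true."""
    options = {"uri": uri}
    if not extended_bson_types:
        options["extendedBsonTypes"] = "false"
    return options


def save_dataframe(df, uri, extended_bson_types=True):
    """Append `df` (an existing DataFrame) to the collection named in `uri`."""
    (
        df.write.format("mongo")
        .mode("append")
        .options(**mongo_write_options(uri, extended_bson_types))
        .save()
    )
```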
Cache Configuration:
- spark.mongodb.keep_alive_ms renamed to mongodb.keep_alive_ms
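Because this setting is a JVM system property, it is typically supplied as a -D flag, for example through Spark's spark.driver.extraJavaOptions and spark.executor.extraJavaOptions. The sketch below shows one way to update an existing options string for the rename. The helper name is hypothetical, and it assumes your jobs pass the property this way.

```python
OLD_PROPERTY = "spark.mongodb.keep_alive_ms"
NEW_PROPERTY = "mongodb.keep_alive_ms"


def rename_keep_alive_property(java_options):
    """Rewrite -Dspark.mongodb.keep_alive_ms=... to -Dmongodb.keep_alive_ms=..."""
    # Strings that already use the new name contain no match and pass through unchanged.
    return java_options.replace("-D" + OLD_PROPERTY + "=", "-D" + NEW_PROPERTY + "=")
```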
Other changes:
The short form spark.read.format("mongo") can now be used:
- Replace all instances of: format("com.mongodb.spark.sql") with format("mongo")
- Replace all instances of: format("com.mongodb.spark.sql.DefaultSource") with format("mongo")
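The two replacements above can be applied mechanically across a codebase. The following is a small sketch of such a rewrite over source text. The function name is hypothetical, and the pattern only covers literal format(...) calls with a quoted string argument.

```python
import re

# Matches format("com.mongodb.spark.sql") and
# format("com.mongodb.spark.sql.DefaultSource"), with either quote style.
_LONG_FORMAT = re.compile(
    r"""format\(\s*(["'])com\.mongodb\.spark\.sql(?:\.DefaultSource)?\1\s*\)"""
)


def use_short_format(source):
    """Return `source` with long-form Mongo format names replaced by "mongo"."""
    # The original quote character is reused so the surrounding code style is kept.
    return _LONG_FORMAT.sub(
        lambda m: "format({0}mongo{0})".format(m.group(1)), source
    )
```

Other class names under com.mongodb.spark.sql are not touched, so it is still worth reviewing the resulting diff by hand.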