- Type: Improvement
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
Add overloaded Java methods so that data can be loaded from MongoDB into a Dataset&lt;Row&gt; using only a SparkSession (and optionally a ReadConfig), without having to provide a TypeTag&lt;D&gt; or Class&lt;D&gt;. This will also let users stop using the JavaSparkContext.
Currently, the following load methods accept a SparkSession object:
MongoSpark.load(SparkSession, TypeTag<D>)
MongoSpark.load(SparkSession, ReadConfig, Class<D>)
MongoSpark.load(SparkSession, ReadConfig, TypeTag<D>)
MongoSpark.load(SparkSession, ReadConfig, TypeTag<D>, DefaultsTo<D,Document>)
Desired load methods:
Dataset<Row> ds1 = MongoSpark.load(sparkSession);
Dataset<Row> ds2 = MongoSpark.load(sparkSession, readConfig);
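A minimal sketch of how the proposed overloads could delegate to the existing ReadConfig-based method, with the element type defaulting to Row. SparkSession, ReadConfig, Row, and Dataset here are placeholder stand-ins so the shape compiles standalone; the real types live in org.apache.spark.sql and the MongoDB Spark connector, and ReadConfig.create(session) is a hypothetical default-deriving factory.

```java
// Placeholder stand-ins for illustration only; not the real Spark/connector types.
final class SparkSession { final String appName = "demo"; }

final class ReadConfig {
    final String collection;
    ReadConfig(String collection) { this.collection = collection; }
    // Hypothetical: derive a default ReadConfig from the session's configuration.
    static ReadConfig create(SparkSession session) { return new ReadConfig("coll"); }
}

final class Row {}

final class Dataset<T> {
    final ReadConfig source;
    Dataset(ReadConfig source) { this.source = source; }
}

public final class MongoSparkSketch {
    // Existing style: the caller supplies the ReadConfig (and, in the real
    // API, a Class<D> or TypeTag<D> as well).
    public static Dataset<Row> load(SparkSession session, ReadConfig config) {
        return new Dataset<>(config);
    }

    // Proposed overload: only a SparkSession; the ReadConfig is derived from
    // the session and the element type defaults to Row.
    public static Dataset<Row> load(SparkSession session) {
        return load(session, ReadConfig.create(session));
    }

    public static void main(String[] args) {
        SparkSession spark = new SparkSession();
        Dataset<Row> ds1 = load(spark);                       // new overload
        Dataset<Row> ds2 = load(spark, new ReadConfig("x"));  // existing style
        System.out.println(ds1.source.collection + " " + ds2.source.collection);
    }
}
```

The key design point is that the new overloads add no behavior of their own; they only fill in defaults and forward to the fully parameterized method.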
Questions:
- Do we want to add a method that takes a Class<D> without a ReadConfig, e.g.,
MongoSpark.load(SparkSession, Class<D>)?
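If that overload were added, it could delegate the same way as the Row-typed variants. A sketch using placeholder types (the real SparkSession/ReadConfig/Dataset come from Spark and the connector; ReadConfig.create is a hypothetical default-deriving factory):

```java
// Placeholder stand-ins for illustration only.
final class SparkSession {}
final class ReadConfig { static ReadConfig create(SparkSession s) { return new ReadConfig(); } }
final class Dataset<D> { final Class<D> elementType; Dataset(Class<D> t) { elementType = t; } }

public final class ClassOverloadSketch {
    // Existing shape: session + config + target class.
    static <D> Dataset<D> load(SparkSession s, ReadConfig c, Class<D> clazz) {
        return new Dataset<>(clazz);
    }

    // The overload under discussion: no ReadConfig, derived from the session.
    static <D> Dataset<D> load(SparkSession s, Class<D> clazz) {
        return load(s, ReadConfig.create(s), clazz);
    }

    public static void main(String[] args) {
        Dataset<String> ds = load(new SparkSession(), String.class);
        System.out.println(ds.elementType.getSimpleName());
    }
}
```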