Overview
We're having two related issues with the mongo-spark-connector's handling of dates:
1. Writing and then reading an object with a java.sql.Date field results in org.apache.spark.sql.AnalysisException: Cannot up cast <field name> from "TIMESTAMP" to "DATE". I would consider this a bug, because you cannot read back the data you have just written.
2. Reading documents with dates (meaning MongoDB dates, which store a date and time of day as UTC milliseconds) while the Java 8 time API is enabled in Spark fails with java.sql.Timestamp is not a valid external type for schema of date. I request that the mongo-spark-connector support this feature flag.
See the attached files for complete code samples that reproduce these issues; a minimal sketch of the first issue follows below.
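For quick reference, this is a minimal sketch of the first issue. It assumes mongo-spark-connector 3.x-style format name and configuration keys; the case class, collection URI and object names are illustrative placeholders, not the code from the attached project:
{code:scala}
import java.sql.Date

import org.apache.spark.sql.SparkSession

// Illustrative case class; any case class with a java.sql.Date field shows the problem.
case class Event(name: String, day: Date)

object DateRoundTrip {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("date-roundtrip")
      // URIs and config keys assume mongo-spark-connector 3.x; adjust for your setup.
      .config("spark.mongodb.output.uri", "mongodb://localhost/test.events")
      .config("spark.mongodb.input.uri", "mongodb://localhost/test.events")
      .getOrCreate()
    import spark.implicits._

    // Write a Dataset with a java.sql.Date column; the connector stores it as a BSON date.
    Seq(Event("release", Date.valueOf("2021-06-01"))).toDS()
      .write.format("mongo").mode("append").save()

    // Reading the same collection back as the same case class fails with the
    // AnalysisException quoted above, because the inferred schema maps BSON dates
    // to TimestampType while the case class expects DateType.
    spark.read.format("mongo").load().as[Event].show()

    spark.stop()
  }
}
{code}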
Details
In SPARK-340 a change was introduced that converts MongoDB dates to java.sql.Timestamp when converting Mongo documents to Spark SQL rows.
Spark is then unable to encode this value as a date.
Our team would prefer to move away from the date and time API in java.sql altogether. Spark has also made efforts to support this: internally it uses the "modern" java.time API (some background), and with the configuration parameter spark.sql.datetime.java8API.enabled it also enables serialization and deserialization of java.time.* fields, as sketched below.
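To illustrate what the flag does in plain Spark (no MongoDB involved; class and column names are illustrative), here is a minimal sketch: with spark.sql.datetime.java8API.enabled set to true, collected rows expose DATE columns as java.time.LocalDate instead of java.sql.Date.
{code:scala}
import java.sql.Date

import org.apache.spark.sql.SparkSession

object Java8TimeApiDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("java8-time-api-demo")
      // Switches Spark's external types for DATE and TIMESTAMP columns to
      // java.time.LocalDate and java.time.Instant.
      .config("spark.sql.datetime.java8API.enabled", "true")
      .getOrCreate()
    import spark.implicits._

    // A plain DataFrame with a DATE column: with the flag enabled, the collected rows
    // contain java.time.LocalDate values rather than java.sql.Date.
    val df = Seq(Date.valueOf("2021-06-01")).toDF("day")
    df.collect().foreach(row => println(row.get(0).getClass)) // class java.time.LocalDate

    spark.stop()
  }
}
{code}
The connector, however, still hands Spark java.sql.Timestamp values for BSON dates, which is what triggers the "not a valid external type" error in the second issue above.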
The attached sbt project contains two runnable programs to reproduce the described issues.