Type: Bug
Resolution: Fixed
Priority: Unknown
Affects Version/s: None
Component/s: None
What did I use
- Databricks Runtime Version 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)
- org.mongodb.spark:mongo-spark-connector:10.0.1
- MongoDB 5.0
What did I do
I tried to load the following documents from MongoDB into Databricks:
[
  {
    "_id": { "$oid": "6289e26430540f2e5db55f3c" },
    "username": "tmp_user_3",
    "attributes": []
  },
  {
    "_id": { "$oid": "6289e26430540f2e5db55f3f" },
    "username": "tmp_user_4",
    "attributes": null
  },
  {
    "_id": { "$oid": "6289e26430540f2e5db55f3d" },
    "username": "tmp_user_2",
    "attributes": [
      { "key": "c", "value": 3 }
    ]
  },
  {
    "_id": { "$oid": "6289e26430540f2e5db55f3e" },
    "username": "tmp_user_1",
    "attributes": [
      { "key": "a", "value": 1 },
      { "key": "b", "value": 2 }
    ]
  }
]
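For reference, a minimal sketch of how equivalent test documents could be inserted with pymongo to reproduce the collection. The connection_uri, database, and collection values are placeholders matching the read options below, and the _id values are left to the driver rather than the exact ObjectIds above:

from pymongo import MongoClient

# Placeholders; adjust to match your environment (assumption, not from the report).
connection_uri = "mongodb://localhost:27017"
database = "test"
collection = "users"

client = MongoClient(connection_uri)
coll = client[database][collection]

# Same shapes as the documents above; the null 'attributes' on
# tmp_user_4 is the value that triggers the failure.
coll.insert_many([
    {"username": "tmp_user_3", "attributes": []},
    {"username": "tmp_user_4", "attributes": None},
    {"username": "tmp_user_2", "attributes": [{"key": "c", "value": 3}]},
    {"username": "tmp_user_1", "attributes": [{"key": "a", "value": 1},
                                              {"key": "b", "value": 2}]},
])

I then read the collection with: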
(
    spark.read
    .format("mongodb")
    .option("database", database)
    .option("collection", collection)
    .option("connection.uri", connection_uri)
    .load()
    .display()
)
What did I get
The data can't be read from MongoDB; the job fails with:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 1 times, most recent failure: Lost task 0.0 in stage 79.0 (TID 304) (ip-10-172-164-192.us-west-2.compute.internal executor driver): com.mongodb.spark.sql.connector.exceptions.DataException: Invalid field: 'attributes'. The dataType 'array' is invalid for 'BsonNull'.
What do I expect
The DataFrame is displayed without errors.
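A possible interim workaround, until a connector release with the fix is available, is to normalize the null arrays server-side before the connector converts the documents, using the v10 connector's aggregation.pipeline read option. This is a sketch, assuming it is acceptable to map a null or missing 'attributes' to an empty array ($ifNull requires no special server support; $set needs MongoDB 4.2+, satisfied by 5.0 here):

# Sketch: coalesce null 'attributes' to [] inside MongoDB so that
# no BsonNull reaches the array field during row conversion.
pipeline = '[{"$set": {"attributes": {"$ifNull": ["$attributes", []]}}}]'

(
    spark.read
    .format("mongodb")
    .option("database", database)
    .option("collection", collection)
    .option("connection.uri", connection_uri)
    .option("aggregation.pipeline", pipeline)
    .load()
    .display()
)

Note that this changes the loaded value for tmp_user_4 from null to an empty array, which may or may not be acceptable for downstream logic.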
Related to: SPARK-351 "array field with null value is not written to mongodb" (Closed)