- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Writes
We are trying to upsert documents in MongoDB collections that have a unique index (both single-column and composite) separate from the default "_id" index. The "replaceDocument" option works well when we are dealing only with the default "_id" unique index.
What is the correct way to achieve upserts on documents with unique indexes other than "_id"? Is there a "mapping_id" concept through which we can tell the Mongo Spark connector which fields to perform upserts on?
The current error we get is a standard duplicate key error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 16.0 failed 1 times, most recent failure: Lost task 0.0 in stage 16.0 (TID 209, localhost, executor driver): com.mongodb.MongoBulkWriteException: Bulk write operation error on server xx.xx.xx.xx:27017.
Write errors: [BulkWriteError{index=2, code=11000, message='E11000 duplicate key error collection: ekg.datapull_test index: name_-1 dup key: { : "a" }', details={ }}].
	at com.mongodb.connection.BulkWriteBatchCombiner.getError(BulkWriteBatchCombiner.java:176)
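For reference, one possible workaround (not an official connector feature) is to bypass MongoSpark.save and issue the upserts directly with the MongoDB Java driver inside foreachPartition, using a bulk write of ReplaceOneModel operations whose filter keys on the unique-index field(s) instead of "_id". The sketch below assumes the ekg.datapull_test collection and the "name" field from the error above; the connection string and the upsertByName helper name are placeholders.

import scala.collection.JavaConverters._

import com.mongodb.client.MongoClients
import com.mongodb.client.model.{Filters, ReplaceOneModel, ReplaceOptions}
import org.apache.spark.sql.{DataFrame, Row}
import org.bson.Document

// Sketch: upsert rows keyed on the unique "name" field rather than "_id".
// The connection string and helper name are placeholders; the database,
// collection, and field names come from the error message above.
def upsertByName(df: DataFrame): Unit = {
  df.foreachPartition { rows: Iterator[Row] =>
    val client = MongoClients.create("mongodb://xx.xx.xx.xx:27017")
    try {
      val coll = client.getDatabase("ekg").getCollection("datapull_test")
      val models = rows.map { row =>
        // Copy every column of the Spark row into a BSON document.
        val doc = new Document()
        row.schema.fieldNames.foreach(f => doc.append(f, row.getAs[AnyRef](f)))
        // Filter on the unique-index field(s); upsert(true) inserts when no
        // document matches and replaces the existing document otherwise.
        new ReplaceOneModel[Document](
          Filters.eq("name", doc.get("name")),
          doc,
          new ReplaceOptions().upsert(true))
      }.toSeq
      if (models.nonEmpty) coll.bulkWrite(models.asJava)
    } finally {
      client.close()
    }
  }
}

For a composite unique index, the filter would combine the fields with Filters.and(...). Note that each ReplaceOne still replaces the whole document, which matches the replaceDocument semantics the connector already provides for "_id".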