- Type: Task
- Resolution: Done
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: Writes
I have a CSV file that I need to save in MongoDB. My collection already has some data (a couple of million documents), and I need to save only the new records and ignore the ones that are already in the collection.
How can I do that? I already have the code below:
Map<String, String> writeOverrides = new HashMap<String, String>();
writeOverrides.put("collection", this.collection);
writeOverrides.put("replaceDocument", "false");
writeOverrides.put("ordered", "false");
WriteConfig writeConfig = WriteConfig.create(getJavaSparkContext()).withOptions(writeOverrides);
MongoSpark.save(ds.write().mode(SaveMode.Ignore), writeConfig);
I've already tried all the SaveModes and none of them worked the way I need.
PS: The only index on the collection is _id.
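To make the behavior I'm after concrete, here is a minimal sketch in plain Java, with no Spark or MongoDB involved. The class and method names (InsertOnlyNew, onlyNew, row) are hypothetical and exist only for this illustration; the set of existing _id values stands in for the collection. In Spark terms this might resemble a left anti join on _id against the existing data, but I have not confirmed that this is the right approach with the connector.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: models the desired "insert only the new ones" semantics.
class InsertOnlyNew {

    // Hypothetical helper: returns the incoming rows whose _id is not already stored.
    static List<Map<String, String>> onlyNew(Set<String> existingIds,
                                             List<Map<String, String>> incoming) {
        // Copy so the caller's set is not modified.
        Set<String> seen = new HashSet<>(existingIds);
        List<Map<String, String>> fresh = new ArrayList<>();
        for (Map<String, String> row : incoming) {
            // Set.add returns false when the _id is already present, so
            // duplicates inside the CSV itself are skipped as well.
            if (seen.add(row.get("_id"))) {
                fresh.add(row);
            }
        }
        return fresh;
    }

    // Hypothetical helper: builds one CSV-like row.
    static Map<String, String> row(String id, String name) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("_id", id);
        r.put("name", name);
        return r;
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>();
        existing.add("1");
        existing.add("2");

        List<Map<String, String>> csvRows = new ArrayList<>();
        csvRows.add(row("2", "already stored"));
        csvRows.add(row("3", "new"));

        // Only the row with _id "3" should remain.
        System.out.println(onlyNew(existing, csvRows));
    }
}
```

Existing documents must be left untouched (not replaced or updated), and rows whose _id is already in the collection should be skipped silently rather than cause the whole write to fail.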