- Type: Improvement
- Resolution: Won't Do
- Priority: Trivial - P5
- None
- Affects Version/s: None
- Component/s: None
- Labels: None
Currently the Spark connector uses the DataWriter API for writing. It could additionally support the BatchWrite commit method to finalize writes.
This distributed write API would require a temporary table: each DataWriter task writes its output into the temporary table, and the BatchWrite commit then performs a final move of the collated data into the destination table.
Notes: This would only work for insert-style operations, not update / replace operations. It is a two-phased write that would add latency but ensures a final commit or abort step. Currently, because the DataWriter has no abort operation, its writes cannot be undone.
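The two-phase flow described above can be sketched in plain Python. This is an illustrative stand-in, not the actual Spark DataSource V2 interfaces: the class and method names (`TaskDataWriter`, `BatchWriteCoordinator`, `staging`) are hypothetical, and a dict plays the role of the temporary table.

```python
class TaskDataWriter:
    """Stands in for a per-task DataWriter: buffers rows, then publishes
    them to a shared staging area (the "temporary table")."""

    def __init__(self, staging, task_id):
        self.staging = staging
        self.task_id = task_id
        self.rows = []

    def write(self, row):
        self.rows.append(row)

    def finish(self):
        # Phase 1: publish this task's rows to the temporary table.
        self.staging[self.task_id] = self.rows
        # The task id acts as this task's commit message.
        return self.task_id


class BatchWriteCoordinator:
    """Stands in for BatchWrite: finalizes or aborts the staged output."""

    def __init__(self):
        self.staging = {}  # temporary table, keyed by task id
        self.table = []    # final destination table

    def new_writer(self, task_id):
        return TaskDataWriter(self.staging, task_id)

    def commit(self, commit_messages):
        # Phase 2: move collated data from the temporary table
        # into the final table in one finalizing step.
        for task_id in commit_messages:
            self.table.extend(self.staging.pop(task_id))

    def abort(self, commit_messages):
        # Abort: discard staged data; the final table is never touched.
        # This is the undo step the current DataWriter-only path lacks.
        for task_id in commit_messages:
            self.staging.pop(task_id, None)
```

Until `commit` runs, the destination table is untouched, so `abort` can cleanly discard all task output; this is also why the pattern suits insert-style operations but not in-place updates, which would mutate the destination before the final move.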