Spark Connector / SPARK-315

Support distributed writes

    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Trivial - P5
    • Fix Version/s: None
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?


      Currently the Spark connector uses the DataWriter API for writing. It could also implement the BatchWrite commit method to finalize writes.

      This distributed write API would require a temporary table into which the task-level DataWriter output is collated, followed by a final move of the data out of the temporary table in the BatchWrite commit.

      Notes: This would only work for insert-style operations, not update/replace operations. It is a two-phased write that would add latency but ensure a final commit or abort step. Currently, as there is no abort step, operations performed in the DataWriter cannot be undone.
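
      As an illustration only, the sketch below shows what such a two-phase commit could look like against Spark's DataSource V2 BatchWrite interface. All names here (TempCollectionBatchWrite, mongoUri, dbName, tempColl, finalColl, and the constructor-injected writerFactory) are hypothetical, not classes from the connector. In this sketch the tasks' DataWriters are assumed to stage documents into a temporary collection; commit moves them into the final collection with a $merge aggregation (MongoDB 4.2+), and abort drops the temporary collection.

{code:scala}
import java.util.Collections

import com.mongodb.client.MongoClients
import org.apache.spark.sql.connector.write.{BatchWrite, DataWriterFactory, PhysicalWriteInfo, WriterCommitMessage}
import org.bson.Document

// Hypothetical two-phase BatchWrite: tasks stage documents into a temporary
// collection via their DataWriters; the driver finalizes or discards the
// staged data here.
class TempCollectionBatchWrite(
    mongoUri: String,
    dbName: String,
    tempColl: String,
    finalColl: String,
    writerFactory: DataWriterFactory // assumed to insert task rows into tempColl
) extends BatchWrite {

  override def createBatchWriterFactory(info: PhysicalWriteInfo): DataWriterFactory =
    writerFactory

  // Phase 2 (success): every task committed, so move the staged documents
  // into the final collection. whenNotMatched: "insert" with whenMatched:
  // "fail" gives insert-only semantics, matching the restriction above.
  override def commit(messages: Array[WriterCommitMessage]): Unit = {
    val client = MongoClients.create(mongoUri)
    try {
      val temp = client.getDatabase(dbName).getCollection(tempColl)
      val merge = new Document("$merge",
        new Document("into", finalColl)
          .append("whenMatched", "fail") // insert style only, no update/replace
          .append("whenNotMatched", "insert"))
      temp.aggregate(Collections.singletonList(merge)).toCollection()
      temp.drop()
    } finally client.close()
  }

  // Phase 2 (failure): a task failed, so all staged writes can be undone
  // in one step by dropping the temporary collection.
  override def abort(messages: Array[WriterCommitMessage]): Unit = {
    val client = MongoClients.create(mongoUri)
    try client.getDatabase(dbName).getCollection(tempColl).drop()
    finally client.close()
  }
}
{code}

      Dropping a single temporary collection is what makes abort cheap here; the trade-off is the extra $merge pass over all staged documents at commit time, which is the added latency mentioned in the notes.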

            Assignee:
            Unassigned
            Reporter:
            Ross Lawley (ross@mongodb.com)
            Votes:
            0
            Watchers:
            1
