Spark Connector / SPARK-135

Need a way to specify schema in PySpark (Mongo Connector)

    • Type: Task
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Schema
    • Environment:
      Python, Spark 2.0, PySpark

      I would like to specify a schema for my collection. Its schema has evolved over time: fields have been added but never removed. I need to supply the schema (that of the latest documents) explicitly so I don't have to read the full collection just to infer one.

      I know this can be done in Scala, but there is no documentation for it in Python.

            Assignee:
            Ross Lawley (ross@mongodb.com)
            Reporter:
            Jeremy (jeremyber)
            Votes:
            1
            Watchers:
            3
