Spark Connector / SPARK-135

Need a way to specify schema in PySpark (Mongo Connector)


    Details

    • Type: Question
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Schema
    • Environment: Python, Spark 2.0, PySpark
    • # Replies: 1
    • Last comment by Customer: false

      Description

      I would like to specify a schema for my collection. Its schema has evolved over time, adding fields but never removing any. I need to specify a schema (that of the latest document) so I don't have to read the full collection to infer one.

      I know this can be done in Scala, but there is no documentation for it in Python.
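
      The ticket was closed as "Works as Designed", and the standard PySpark route should apply here: pass an explicit StructType to DataFrameReader.schema() before .load(), which lets the connector skip sampling the collection. A minimal sketch, assuming the MongoDB Spark Connector 2.x is on the classpath; the URI, database, collection, and field names below are illustrative placeholders, not taken from the ticket:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Placeholder URI/database/collection; adjust for the real deployment.
spark = SparkSession.builder \
    .appName("mongo-explicit-schema") \
    .config("spark.mongodb.input.uri", "mongodb://localhost/test.myCollection") \
    .getOrCreate()

# Schema of the latest document shape; field names here are hypothetical.
# Older documents that lack a declared field should surface it as null.
schema = StructType([
    StructField("name", StringType(), True),
    StructField("email", StringType(), True),
    StructField("age", IntegerType(), True),
])

# Supplying the schema to the reader skips the connector's sampling /
# inference pass over the collection.
df = spark.read \
    .format("com.mongodb.spark.sql.DefaultSource") \
    .schema(schema) \
    .load()

df.printSchema()

      Since the collection only ever adds fields, declaring the newest document's shape should be safe: columns missing from older documents come back as null rather than causing a read failure.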


    People

    • Assignee: Ross Lawley (ross.lawley)
    • Reporter: Jeremy (jeremyber)
    • Last commenter: Rathi Gnanasekaran
    • Votes: 1
    • Watchers: 3

    Dates

    • Created:
    • Updated:
    • Resolved:
    • Days since reply: 2 years, 28 weeks, 6 days ago
    • Date of 1st Reply: