Spark Connector / SPARK-182

MongoSpark connector silently ignores documents it cannot unmarshal

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.2.2
    • Component/s: API
    • Labels: None

      I just finished debugging my code and discovered some unusual behavior in the Java MongoSpark connector:

      I have a class with a primitive boolean field. When I load a collection into a Dataset using the following line (I just override the collection in the readConfig):

      MongoSpark.load( sparkSession, readConfig, MyClass.class )
          .toDF()
          .as( Encoders.bean( MyClass.class ) )

      this loads only a subset of my collection. It was driving me insane until I discovered that the documents that were not being returned were missing that field in the database.

      The connector gives no error or warning about this, and I could not find any reference to it in the documentation or online. Is this the expected behaviour?


      I can provide an example if needed.
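
      One plausible factor (an assumption on my part, not confirmed in this ticket) is that a primitive boolean field has no way to represent a missing value, whereas a boxed Boolean can hold null. A minimal sketch of a hypothetical MyClass-style bean, with invented field names, illustrating the difference:

      ```java
      // Hypothetical bean mirroring the reporter's MyClass.
      // A primitive boolean always has a value (defaults to false),
      // so there is no state that can represent "field absent in the
      // document". A boxed Boolean can be null, which gives the
      // deserializer a way to express a missing field.
      class MyClassExample {
          private boolean flagPrimitive; // defaults to false; no "absent" state
          private Boolean flagBoxed;     // null when the field is missing

          public boolean isFlagPrimitive() { return flagPrimitive; }
          public void setFlagPrimitive(boolean v) { flagPrimitive = v; }

          public Boolean getFlagBoxed() { return flagBoxed; }
          public void setFlagBoxed(Boolean v) { flagBoxed = v; }
      }
      ```

      Whether declaring the field as Boolean instead of boolean would make the connector return these documents (with null) rather than silently skipping them is something that would need to be verified against the connector itself.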

            Assignee:
            ross@mongodb.com Ross Lawley
            Reporter:
            deathcoder Davide Capozzi
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: