MongoSpark connector silently ignores documents it cannot unmarshal


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.2.2
    • Component/s: API

      I just finished debugging my code and discovered this unusual behavior with the Java MongoSpark connector:

      I have an object that contains a primitive boolean field. When I load a collection into a Dataset with the following line (I just override the collection in the readConfig):

      MongoSpark.load( sparkSession, readConfig, MyClass.class )
          .toDF()
          .as( Encoders.bean( MyClass.class ) )

      This loads only a subset of my collection. It was driving me insane until I discovered that the documents that were not being returned were missing that field in the database.

      The connector gives no error or warning about this, and I could not find any reference to it in the documentation or online. Is this the expected behaviour?

       

      I can provide an example if needed.
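
      A possible workaround, sketched below under the assumption that the dropped documents are those where the primitive field cannot be populated: declaring the field as the `Boolean` wrapper instead of primitive `boolean` lets an absent MongoDB field map to `null` rather than an unrepresentable value. The class name `MyClass` and the field `active` are hypothetical stand-ins for the reporter's actual bean.

      ```java
      // Hypothetical bean sketch. A primitive boolean cannot represent "field
      // absent in the document"; the Boolean wrapper can, as null.
      public class MyClass implements java.io.Serializable {
          private String id;
          private Boolean active; // wrapper type: null when the field is missing in MongoDB

          public String getId() { return id; }
          public void setId(String id) { this.id = id; }

          public Boolean getActive() { return active; }
          public void setActive(Boolean active) { this.active = active; }

          public static void main(String[] args) {
              // Simulate a document that lacks the boolean field.
              MyClass fromSparseDoc = new MyClass();
              fromSparseDoc.setId("abc");
              // The absence survives as null instead of the document being lost.
              System.out.println(fromSparseDoc.getActive() == null); // prints "true"
          }
      }
      ```

      Whether `Encoders.bean` then yields `null` for those rows, or the rows are still filtered, would need to be verified against the connector version in use (2.2.2 here).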

              Assignee:
              Ross Lawley
              Reporter:
              Davide Capozzi
              Votes:
              0
              Watchers:
              2
