INTPYTHON-165 (Python Integrations)

Auto schema detection can yield a different table on missing values

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Unknown
    • Fix Version/s: pymongoarrow-1.6
    • Affects Version/s: None
    • Component/s: Schemas
    • Labels: None

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?


      Discovered in ARROW-158: our auto schema detection can yield a different table than an explicit schema does when values are missing:

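          # Imports assumed to be at the top of the test module.
          from pyarrow import list_, string
          from pymongoarrow.api import Schema, find_arrow_all

          def test_auto_schema_missing_values(self):
              # The first and last documents hold empty lists.
              docs = [
                  {"a": []},
                  {"a": ["str"]},
                  {"a": []},
              ]
              self.coll.delete_many({})
              self.coll.insert_many(docs)
              # Read once with the inferred schema and once with an explicit one.
              actual = find_arrow_all(self.coll, {}, projection={"_id": 0})
              expected = find_arrow_all(
                  self.coll, {}, projection={"_id": 0}, schema=Schema({"a": list_(string())})
              )
              # The schemas agree; the table contents do not.
              self.assertEqual(actual.schema, expected.schema)
              self.assertEqual(actual, expected)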
      

      Output:

      >       self.assertEqual(actual, expected)
      E       AssertionError: pyarr[19 chars]em: string>
      E         child 0, item: string
      E       ----
      E       a: [[null,["str"],[]]] != pyarr[19 chars]em: string>
      E         child 0, item: string
      E       ----
      E       a: [[[],["str"],[]]]
      
      test/test_arrow.py:488: AssertionError
      

            Assignee: Steve Silvester (steve.silvester@mongodb.com)
            Reporter: Shane Harvey (shane.harvey@mongodb.com)
            Votes: 0
            Watchers: 1
