  1. Python Integrations
  2. INTPYTHON-520

[pymongoarrow] Ensure all parquet data types are handled

    • Python Drivers

      Context

      User norrbom reported the following bug:

      • Read from Parquet using pyarrow.parquet.read_table(), which returns a pyarrow.Table.
      • PyMongoArrow converts Arrow to Python (via pyarrow.RecordBatch.to_pylist()). Arrow decimal128 values become decimal.Decimal objects, which is correct.
      • PyMongoArrow converts Python to BSON using the Python bson module. This is where it goes wrong: bson cannot encode decimal.Decimal.
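
      One possible way to bridge the last step is sketched below. It is an illustration only, not PyMongoArrow's actual fix: the helper name replace_decimals and its convert parameter are hypothetical. The sketch walks the plain Python rows and swaps each decimal.Decimal for a BSON-encodable value, defaulting to bson.decimal128.Decimal128, which accepts a decimal.Decimal in its constructor.

```python
from decimal import Decimal


def replace_decimals(value, convert=None):
    """Recursively replace decimal.Decimal values inside dicts and lists.

    `convert` (hypothetical parameter) is the callable applied to each
    Decimal. When omitted, it defaults to bson.decimal128.Decimal128,
    which requires PyMongo's bson package to be installed.
    """
    if convert is None:
        # Imported lazily so the helper can be exercised without bson.
        from bson.decimal128 import Decimal128 as convert
    if isinstance(value, Decimal):
        return convert(value)
    if isinstance(value, dict):
        return {k: replace_decimals(v, convert) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_decimals(v, convert) for v in value]
    # Anything else (str, int, None, ...) passes through unchanged.
    return value


# Hypothetical usage, assuming `table` came from pyarrow.parquet.read_table():
# docs = [replace_decimals(row) for row in table.to_pylist()]
```

      Passing the converter in as an argument keeps the traversal independent of bson, so it can be unit-tested with any stand-in callable.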

      PyMongoArrow should be able to handle all Parquet logical types.

      Definition of done

      Add a test that reads a file containing all standard Parquet data types. Add a conversion table to our docs.

      Pitfalls

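      Watch out for loss of precision, for example when converting decimals (BSON Decimal128 holds at most 34 significant digits, while Arrow decimal128 allows a precision of up to 38) or timestamps (BSON datetimes have millisecond resolution).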
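
      As a rough illustration of one such check, the sketch below tests whether a decimal.Decimal has few enough digits to be stored exactly in a 34-digit decimal128 value. The function and constant names are hypothetical, and the check is deliberately conservative: it counts trailing zeros as digits and ignores the exponent range.

```python
from decimal import Decimal

# IEEE 754-2008 decimal128, the format behind BSON Decimal128, stores a
# coefficient of at most 34 decimal digits.
BSON_DECIMAL128_MAX_DIGITS = 34


def fits_bson_decimal128(value: Decimal) -> bool:
    """Return True if the coefficient of `value` has at most 34 digits."""
    return len(value.as_tuple().digits) <= BSON_DECIMAL128_MAX_DIGITS


# A 38-digit value, as an Arrow decimal128 column may hold, does not fit.
wide = Decimal("1" * 38)
narrow = Decimal("12345.6789")
print(fits_bson_decimal128(wide), fits_bson_decimal128(narrow))
```

      A test over the Parquet sample file could run a check like this on each decimal column before conversion, so that values too wide for BSON are reported rather than silently rounded.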

            Assignee:
            Unassigned
            Reporter:
            Steve Silvester (steve.silvester@mongodb.com)