Type: Task
Resolution: Unresolved
Priority: Unknown
Affects Version/s: None
Component/s: pymongoarrow
Python Drivers
Context
User norrbom reported the following bug:
- Read from Parquet using pyarrow.parquet.read_table(), which returns a pyarrow.Table.
- PyMongoArrow converts Arrow to Python (via pyarrow.RecordBatch.to_pylist()). Arrow decimal128 values become Python decimal.Decimal objects, which is correct.
- PyMongoArrow converts Python to BSON using the Python bson module. This is where it goes wrong: bson cannot encode decimal.Decimal values; it only encodes decimals wrapped in bson.decimal128.Decimal128.
We should be able to handle all Parquet Logical types.
Definition of done
Add a test that reads a file containing all standard Parquet data types. Add a conversion table to our docs.
Pitfalls
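Watch out for loss of precision. For example, Arrow's decimal128 type allows up to 38 digits of precision, while BSON Decimal128 holds at most 34 significant decimal digits.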
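One way a test could detect decimal precision loss is sketched below, using only the standard library. It assumes the 34-significant-digit limit of IEEE 754-2008 decimal128, which BSON Decimal128 follows. `fits_bson_decimal128` is a hypothetical name, and the check covers finite values only, ignoring the exponent range.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumption: BSON Decimal128 carries at most 34 significant decimal digits.
BSON_DECIMAL128_DIGITS = 34


def fits_bson_decimal128(value: Decimal) -> bool:
    """Return True if a finite value is unchanged by rounding to 34 digits."""
    ctx = Context(prec=BSON_DECIMAL128_DIGITS, rounding=ROUND_HALF_EVEN)
    # create_decimal rounds to the context precision; Decimal equality is exact.
    return ctx.create_decimal(value) == value


digits_34 = Decimal("1234567890123456789012345678901234")
digits_38 = Decimal("12345678901234567890123456789012345678")

ok_34 = fits_bson_decimal128(digits_34)  # 34 digits: representable
ok_38 = fits_bson_decimal128(digits_38)  # 38 digits: would be rounded
```

A check like this could run against each decimal column in the test file so that lossy values are reported instead of being rounded silently.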