- Type: Task
- Resolution: Unresolved
- Priority: Unknown
- Affects Version/s: None
- Component/s: None
dask-mongo currently represents data as a Dask bag, which is essentially a list of dictionaries. This is slow and inefficient, but it was necessary before pymongoarrow supported a wider range of data types. We should update dask-mongo to produce DataFrames directly, using pymongoarrow under the hood.
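As a rough illustration of the proposal, the sketch below builds a Dask DataFrame from per-partition pymongoarrow reads. It is not dask-mongo's current or planned API: the function names (`partition_filters`, `load_partition`, `read_mongo_dataframe`), the assumption that the collection has an integer field to partition on, and the choice to pass the schema as a plain dict are all hypothetical. The only library calls it relies on are `pymongoarrow.api.find_pandas_all`, `pymongoarrow.api.Schema`, `dask.delayed`, and `dask.dataframe.from_delayed`.

```python
def partition_filters(field, lo, hi, npartitions):
    """Split the half-open integer range [lo, hi) into MongoDB range filters.

    Hypothetical helper: assumes `field` holds integers usable as a
    partitioning key.
    """
    if npartitions < 1 or hi <= lo:
        raise ValueError("need npartitions >= 1 and hi > lo")
    bounds = [lo + (hi - lo) * i // npartitions for i in range(npartitions + 1)]
    return [
        {field: {"$gte": start, "$lt": stop}}
        for start, stop in zip(bounds, bounds[1:])
        if stop > start  # skip empty intervals when npartitions > hi - lo
    ]


def load_partition(uri, database, collection, query, schema_fields):
    """Read one partition straight into a pandas DataFrame via pymongoarrow."""
    # Imported lazily so that the helper above works without these packages.
    from pymongo import MongoClient
    from pymongoarrow.api import Schema, find_pandas_all

    # Each task opens its own client; clients are not shared across workers.
    with MongoClient(uri) as client:
        coll = client[database][collection]
        return find_pandas_all(coll, query, schema=Schema(schema_fields))


def read_mongo_dataframe(uri, database, collection, schema_fields,
                         field, lo, hi, npartitions):
    """Assemble a lazy Dask DataFrame with one delayed read per range filter."""
    import dask
    import dask.dataframe as dd

    parts = [
        dask.delayed(load_partition)(uri, database, collection, query, schema_fields)
        for query in partition_filters(field, lo, hi, npartitions)
    ]
    return dd.from_delayed(parts)
```

A call such as `read_mongo_dataframe("mongodb://localhost:27017", "shop", "orders", {"order_id": int, "amount": float}, "order_id", 0, 1_000_000, 8)` would yield eight partitions, each materialised as a typed pandas DataFrame rather than a list of dictionaries. The URI, database, collection, and field names in that call are placeholders.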
- depends on: INTPYTHON-82 With pymongo>=3.12 pymongoarrow is slower than the naive approach (Closed)