- Type: Task
- Resolution: Fixed
- Priority: Unknown
- Affects Version/s: None
- Component/s: None
Pandas just released 2.0.0rc0, and we should ensure we are compatible with it. I ran the test suite against the ARROW-15 branch and got the following failures:
======================================================= FAILURES =======================================================
____________________________________________ TestExplicitPandasApi.test_csv ____________________________________________
self = <test.test_pandas.TestExplicitPandasApi testMethod=test_csv>
    def test_csv(self):
        # Pandas csv does not support nested data.
        # cf https://github.com/pandas-dev/pandas/issues/40652
        _, data = self._create_data()
        for name in data.columns.to_list():
            if isinstance(data[name].dtype, PandasBSONDtype):
                data = data.drop(labels=[name], axis=1)
        with tempfile.NamedTemporaryFile(suffix=".csv") as f:
            f.close()
            # May give RuntimeWarning due to the nulls.
            with warnings.catch_warnings():
                warnings.simplefilter("ignore", RuntimeWarning)
                data.to_csv(f.name, index=False, na_rep="")
            out = pd.read_csv(f.name)
>           self._assert_frames_equal(data, out)
test/test_pandas.py:315:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test/test_pandas.py:108: in _assert_frames_equal
pd.testing.assert_series_equal(in_col, out_col)
pandas/_libs/testing.pyx:52: in pandas._libs.testing.assert_almost_equal
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E AssertionError: Series are different
E
E Series values are different (33.33333 %)
E [index]: [0, 1, 2]
E [left]: [a0, a1, None]
E [right]: [a0, a1, nan]
E At positional index 2, first diff: None != nan
pandas/_libs/testing.pyx:172: AssertionError
__________________________________________ TestSetitem.test_setitem_2d_values __________________________________________
self = <test.pandas_types.test_binary.TestSetitem object at 0x14c9dc820>
data = <PandasBinaryArray>
[ Binary(b'0.02177712209590621', 10), Binary(b'0.41848933357903795', 10),
Binary(b'0.1320307731... nan,
Binary(b'0.41125169736048484', 10), Binary(b'0.59626778121896', 10)]
Length: 100, dtype: bson_Binary[10]
    def test_setitem_2d_values(self, data):
        # GH50085
        original = data.copy()
        df = pd.DataFrame({"a": data, "b": data})
        df.loc[[0, 1], :] = df.loc[[1, 0], :].values
>       assert (df.loc[0, :] == original[1]).all()
E       AssertionError
../../../.venvs/mongo-arrow/lib/python3.10/site-packages/pandas/tests/extension/base/setitem.py:427: AssertionError
=================================================== warnings summary ===================================================
test/pandas_types/test_binary.py::TestSetitem::test_setitem_2d_values
/Users/steve.silvester/workspace/mongo-arrow/bindings/python/pymongoarrow/pandas_types.py:150: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
return self.data == other
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================== short test summary info ================================================
FAILED test/test_pandas.py::TestExplicitPandasApi::test_csv - AssertionError: Series are different
FAILED test/pandas_types/test_binary.py::TestSetitem::test_setitem_2d_values - AssertionError
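The test_csv failure comes down to a missing value that is `None` in the original frame and `nan` after the CSV round trip. Below is a minimal sketch of one way to compare two frames while treating both as the same kind of missing value. The helper name `frames_equal_ignoring_null_kind` and the sample data are hypothetical and not part of pymongoarrow; whether `_assert_frames_equal` should take this approach is an assumption, not a confirmed fix.

```python
import io

import pandas as pd


def frames_equal_ignoring_null_kind(left, right):
    """Hypothetical helper: compare frames, treating None and NaN alike."""
    if list(left.columns) != list(right.columns) or len(left) != len(right):
        return False
    for name in left.columns:
        left_null = left[name].isna()
        right_null = right[name].isna()
        # Missing values must sit at the same positions in both columns.
        if left_null.tolist() != right_null.tolist():
            return False
        # The remaining (non-missing) values must match exactly.
        if left[name][~left_null].tolist() != right[name][~right_null].tolist():
            return False
    return True


# Sample data (assumed): an object column whose missing value is None.
original = pd.DataFrame(
    {"n": [0, 1, 2], "s": pd.Series(["a0", "a1", None], dtype=object)}
)

# Round-trip through CSV in memory; the empty field is read back as NaN.
buf = io.StringIO()
original.to_csv(buf, index=False, na_rep="")
buf.seek(0)
round_tripped = pd.read_csv(buf)

same = frames_equal_ignoring_null_kind(original, round_tripped)
```

Comparing the `isna()` masks separately from the non-missing values avoids depending on how a particular pandas version's `assert_series_equal` treats `None` versus `nan`.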