-
Type: Bug
-
Resolution: Works as Designed
-
Priority: Unknown
-
None
-
Affects Version/s: 4.7, 4.8
-
Component/s: None
-
None
-
Python Drivers
Detailed steps to reproduce the problem?
We have test code that runs against a dockerized MongoDB (mongo:latest); our Python program connects to it for unit/integration tests.
After updating past 4.7.0 the test code became flaky, i.e. it sometimes fails with a `site-packages\pymongo\network.py:377: _OperationCancelled` exception.
Stack trace of the failure, from our code (using a collection):
trade_item\trade_item_service.py:367: in _log_to_database
    audit_logs_collection.update_one(
..\..\..\site-packages\pymongo\collection.py:1077: in update_one
    self._update_retryable(
..\..\..\site-packages\pymongo\collection.py:872: in _update_retryable
    return self.__database.client._retryable_write(
..\..\..\site-packages\pymongo\mongo_client.py:1569: in _retryable_write
    return self._retry_with_session(retryable, func, s, bulk, operation, operation_id)
..\..\..\site-packages\pymongo\mongo_client.py:1455: in _retry_with_session
    return self._retry_internal(
..\..\..\site-packages\pymongo\_csot.py:108: in csot_wrapper
    return func(self, *args, **kwargs)
..\..\..\site-packages\pymongo\mongo_client.py:1501: in _retry_internal
    ).run()
..\..\..\site-packages\pymongo\mongo_client.py:2347: in run
    return self._read() if self._is_read else self._write()
..\..\..\site-packages\pymongo\mongo_client.py:2464: in _write
    return self._func(self._session, conn, self._retryable)  # type: ignore
..\..\..\site-packages\pymongo\collection.py:853: in _update
    return self._update(
..\..\..\site-packages\pymongo\collection.py:806: in _update
    result = conn.command(
..\..\..\site-packages\pymongo\helpers.py:327: in inner
    return func(*args, **kwargs)
..\..\..\site-packages\pymongo\pool.py:1013: in command
    self._raise_connection_failure(error)
..\..\..\site-packages\pymongo\pool.py:985: in command
    return command(
..\..\..\site-packages\pymongo\network.py:202: in command
    reply = receive_message(conn, request_id)
..\..\..\site-packages\pymongo\network.py:317: in receive_message
    length, _, response_to, op_code = _UNPACK_HEADER(_receive_data_on_socket(conn, 16, deadline))
..\..\..\site-packages\pymongo\network.py:394: in _receive_data_on_socket
    wait_for_read(conn, deadline)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

conn = Connection(<socket.socket [closed] fd=-1, family=2, type=1, proto=0>) CLOSED at 1338676934672, deadline = None

    def wait_for_read(conn: Connection, deadline: Optional[float]) -> None:
        """Block until at least one byte is read, or a timeout, or a cancel."""
        sock = conn.conn
        timed_out = False
        # Check if the connection's socket has been manually closed
        if sock.fileno() == -1:
            return
        while True:
            # SSLSocket can have buffered data which won't be caught by select.
            if hasattr(sock, "pending") and sock.pending() > 0:
                readable = True
            else:
                # Wait up to 500ms for the socket to become readable and then
                # check for cancellation.
                if deadline:
                    remaining = deadline - time.monotonic()
                    # When the timeout has expired perform one final check to
                    # see if the socket is readable. This helps avoid spurious
                    # timeouts on AWS Lambda and other FaaS environments.
                    if remaining <= 0:
                        timed_out = True
                    timeout = max(min(remaining, _POLL_TIMEOUT), 0)
                else:
                    timeout = _POLL_TIMEOUT
                readable = conn.socket_checker.select(sock, read=True, timeout=timeout)
            if conn.cancel_context.cancelled:
>               raise _OperationCancelled("operation cancelled")
E           pymongo.errors._OperationCancelled: operation cancelled
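For context on the failing frame: the driver polls the socket in short slices and checks a cancel flag between slices, so when the connection's cancel context is set (e.g. after a monitoring check marks the server unknown), the in-flight read is aborted. A minimal, self-contained model of that poll-and-check loop (hypothetical names, not pymongo's actual classes; simplified in that the real code does one final readability check before timing out):

```python
import select
import socket
import threading
import time
from typing import Optional

_POLL_TIMEOUT = 0.5  # pymongo also polls in ~500 ms slices

class OperationCancelled(Exception):
    """Stand-in for pymongo.errors._OperationCancelled."""

def wait_for_read(sock: socket.socket, cancelled: threading.Event,
                  deadline: Optional[float] = None) -> None:
    """Block until the socket is readable, the deadline passes,
    or the cancel flag is set (checked between poll slices)."""
    while True:
        if deadline is not None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("timed out")
            timeout = max(min(remaining, _POLL_TIMEOUT), 0)
        else:
            timeout = _POLL_TIMEOUT
        readable, _, _ = select.select([sock], [], [], timeout)
        if cancelled.is_set():
            # This is the branch the stack trace above ends in.
            raise OperationCancelled("operation cancelled")
        if readable:
            return

# Demo: cancel while waiting on a socket that never receives data.
a, b = socket.socketpair()
cancel_flag = threading.Event()
threading.Timer(0.1, cancel_flag.set).start()
try:
    wait_for_read(a, cancel_flag, deadline=time.monotonic() + 5)
except OperationCancelled as exc:
    print("cancelled:", exc)
finally:
    a.close()
    b.close()
```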
In the tests we usually hit the problem through either an insert or an update.
We also drop the whole database between tests, and my hunch is that the problem originates there: databases are dropped between tests, and then new tests use the same MongoClient to reach a database/collection.
The tests are written to expect a completely empty database, and some tests even check that specific indexes do or do not exist, which makes dropping databases/collections between tests the easy solution.
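For reference, the teardown pattern described above looks roughly like this (a sketch with hypothetical names; `TEST_DB` and the fixture shape are assumptions, and the mock client stands in for a real MongoClient):

```python
from unittest import mock

TEST_DB = "trade_item_test"  # hypothetical database name

def reset_database(client, name: str = TEST_DB) -> None:
    """Drop the test database so each test starts from an empty state.

    The MongoClient itself is reused across tests; only the data is
    dropped. Under pymongo >= 4.7 this reuse-after-drop is where we
    suspect the flaky _OperationCancelled originates.
    """
    client.drop_database(name)

# In a pytest suite this would typically be an autouse fixture:
#
# @pytest.fixture(autouse=True)
# def clean_db(shared_client):
#     reset_database(shared_client)
#     yield

# Demo with a mock client (no running MongoDB needed):
client = mock.MagicMock()
reset_database(client)
client.drop_database.assert_called_once_with(TEST_DB)
```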
Definition of done: what must be done to consider the task complete?
Find whatever is causing the issue and fix it. One does not want to depend on a package whose builds fail "randomly" at a fairly high rate.
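Until the root cause is addressed, one possible mitigation (a sketch, not the project's recommendation) is to retry operations that fail with a transient connection error; our reading of pymongo's errors module is that the internal `_OperationCancelled` derives from the public `AutoReconnect`, which would be the class to catch in practice. The example below uses a stand-in exception so it runs without a MongoDB server:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class TransientError(Exception):
    """Stand-in for pymongo.errors.AutoReconnect (which we understand
    to be the public base of the internal _OperationCancelled)."""

def with_retries(func: Callable[[], T], attempts: int = 3,
                 backoff: float = 0.05) -> T:
    """Call func, retrying on TransientError with linear backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the error
            time.sleep(backoff * attempt)
    raise AssertionError("unreachable")

# Demo: an operation that fails twice, then succeeds.
calls = {"n": 0}
def flaky_update():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("operation cancelled")
    return "ok"

print(with_retries(flaky_update))  # → ok
```

Whether retrying is acceptable depends on the operation; blanket retries can mask a real connectivity problem, so a fix in the driver (or a documented knob) would still be preferable.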
The exact Python version used, with patch level:
$ python -c "import sys; print(sys.version)"
3.11.8 (tags/v3.11.8:db85d51, Feb 6 2024, 22:03:32) [MSC v.1937 64 bit (AMD64)]
(The issue is also found in our CI, which is 3.11)
The exact version of PyMongo used, with patch level:
$ python -c "import pymongo; print(pymongo.version); print(pymongo.has_c())"
4.7.0
True
AND
4.8.0
True
Describe how MongoDB is set up. Local vs Hosted, version, topology, load balanced, etc.
Windows 11 locally, and Linux in CI (MongoDB running dockerized).
No frameworks or asynchronous network libraries are used.
We do have the `aws` extra installed, but it is probably not used in the tests.
—
Tests work as expected in all versions before 4.7.0, so for now we are pinning to 4.6.3, the highest version we found where our tests and CI do not fail due to this issue.
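The pin described above would look like this in a requirements file (a hypothetical fragment; adjust to whatever dependency manager the project uses):

```text
# requirements.txt: stay below 4.7 until the flakiness is resolved
pymongo==4.6.3
```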
- causes
-
SERVER-93716 Revert to use pymongo v4.3.3
- Closed
- is caused by
-
PYTHON-3175 Preemptively cancel in progress operations when SDAM heartbeats timeout
- Closed