Type: Bug
Resolution: Fixed
Priority: Major - P3
Affects Version/s: 3.9
Component/s: None
Opening a change stream raises a KeyError. This happens even with no arguments at all, i.e. a plain pymongo.database.Database.watch(), and affects both database change streams and collection change streams.
PyMongo part of the stack trace:

  File "/usr/local/lib/python3.6/site-packages/pymongo/database.py", line 606, in watch
    start_after)
  File "/usr/local/lib/python3.6/site-packages/pymongo/change_stream.py", line 90, in __init__
    self._cursor = self._create_cursor()
  File "/usr/local/lib/python3.6/site-packages/pymongo/change_stream.py", line 167, in _create_cursor
    explicit_session=self._session is not None)
  File "/usr/local/lib/python3.6/site-packages/pymongo/change_stream.py", line 161, in _run_aggregation_cmd
    session)
  File "/usr/local/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1465, in _retryable_read
    return func(session, server, sock_info, slave_ok)
  File "/usr/local/lib/python3.6/site-packages/pymongo/aggregation.py", line 150, in get_cursor
    self._process_result(result, session, server, sock_info, slave_ok)
  File "/usr/local/lib/python3.6/site-packages/pymongo/aggregation.py", line 99, in _process_result
    result, session, server, sock_info, slave_ok)
  File "/usr/local/lib/python3.6/site-packages/pymongo/change_stream.py", line 148, in _process_result
    self._start_at_operation_time = result["operationTime"]
KeyError: 'operationTime'
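The failing statement is the unconditional dictionary access result["operationTime"] in _process_result. A minimal sketch of a defensive lookup that tolerates a command reply missing that field; the helper name is illustrative, not the actual driver code:

```python
def extract_operation_time(result):
    """Return operationTime from an aggregate command reply, or None.

    Illustrative helper only: it mirrors the lookup that raises
    KeyError in ChangeStream._process_result, but uses dict.get()
    so a reply without the field does not blow up.
    """
    return result.get("operationTime")

# A reply that carries operationTime behaves as before.
assert extract_operation_time({"ok": 1, "operationTime": 42}) == 42

# A reply without it returns None instead of raising KeyError.
assert extract_operation_time({"ok": 1}) is None
```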
This has happened to us at least twice, and in both cases performing a manual failover resolved the issue. We run a PSA (primary-secondary-arbiter) setup. We have multiple clusters with the same setup and the same client application, and this has occurred in two different clusters over time, seemingly at random.
It appears to be a Python driver issue: at the time opening a change stream from the Python driver was failing, the same operation performed from the mongo shell succeeded.
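Since a manual failover clears the condition, one temporary client-side workaround is to retry opening the stream when this specific KeyError surfaces. A generic sketch, assuming a caller-supplied opener such as lambda: db.watch(); the function name and retry parameters are hypothetical, not part of PyMongo:

```python
import time

def open_with_retry(open_stream, retries=3, delay=1.0):
    """Call open_stream() (e.g. lambda: db.watch()), retrying when the
    aggregate reply lacks operationTime and PyMongo 3.9 raises KeyError.

    Illustrative workaround only; retry count and delay are arbitrary.
    """
    for attempt in range(retries):
        try:
            return open_stream()
        except KeyError as exc:
            # Re-raise anything that is not this specific failure,
            # or the final failed attempt.
            if "operationTime" not in str(exc) or attempt == retries - 1:
                raise
            time.sleep(delay)

# Demo with a fake opener that fails once, then succeeds.
calls = {"n": 0}
def fake_open():
    calls["n"] += 1
    if calls["n"] == 1:
        raise KeyError("operationTime")
    return "stream"

assert open_with_retry(fake_open, delay=0.0) == "stream"
assert calls["n"] == 2
```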
While the Python change stream was failing, I found the following messages in the MongoDB log. They show the aggregate issued by the watch command we execute, so they seem relevant:
2020-03-29T10:28:00.095+0000 I COMMAND [conn1030113] command tcz_app.$cmd.aggregate command: aggregate { aggregate: 1, pipeline: [ { $changeStream: { fullDocument: "updateLookup" } }, { $match: { ns.coll: { $in: [ "experiments", "expressions", "running_aggregates", "metrics", "event_segmentations", "companies", "segmentations", "predictions", "shared_cookies", "aggregates" ] } } } ], cursor: {}, lsid: { id: UUID("264249e7-88c3-4fff-8aa7-5ab66e349392") }, $db: "tcz_app", $readPreference: { mode: "primary" } } planSummary: COLLSCAN cursorid:4813959753141157500 keysExamined:0 docsExamined:1 numYields:0 nreturned:0 ok:0 errMsg:"Error waiting for snapshot not less than { ts: Timestamp(1585477665, 14), t: -1 }, current relevant optime is { ts: Timestamp(1585477584, 176), t: 11 }. :: caused by :: operation exceeded time limit" errName:MaxTimeMSExpired errCode:50 reslen:290 locks:{ ReplicationStateTransition: { acquireCount: { w: 2 } }, Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 2 } }, Mutex: { acquireCount: { r: 2 } }, oplog: { acquireCount: { r: 2 } } } protocol:op_msg 15011ms
Python version:
$ python -c "import sys; print(sys.version)"
3.6.10 (default, Feb 26 2020, 16:13:33) [GCC 8.3.0]
PyMongo version:
$ python -c "import pymongo; print(pymongo.version); print(pymongo.has_c())"
3.9.0
True
MongoDB version:
4.2.1 (mongodb:4.2.1 Docker image)
Related to: PYTHON-2181 "ChangeStreams should account for missing operationTime in aggregate command response" (Closed)