- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Atlas Streams
- Fully Compatible
- Sprint 59, Sprint 60
See this investigation doc: https://docs.google.com/document/d/1g5d3zq3m4VGGg0F5a7DRLK_oWdf4oOduBgHsHkMjGRM/edit
====
This also happened around 10/3/2024:
Stack trace record:
On Azure staging, pod host=streams-spp-canary-86bdd58f58-9v5zm
mongostream reports that it is running kanopy_immortal_smoke_test_8fc9ad56_08997cbe, but the processor is not processing input messages.
It's not fully stuck: the runOnce metric is still incrementing, but it's not picking up any new messages. I even tried manually writing to the collection that sources it.
https://victoria-metrics.corp.mongodb.com/select/0/vmui/#/?g0.expr=mongohouse_mstream_runonce_count%7Bprocessor_id%3D%2266fac1acb5cdfac68231529a%22%7D&g0.range_input=3h&g0.end_input=2024-10-03T16%3A46%3A49&g0.tab=0&g0.relative_time=none&g0.tenantID=0
- depends on
  - SERVER-95932 Upgrade mongoc driver version to 1.24.3 (Closed)
- related to
  - SERVER-95823 Add a metric to change stream consumer thread (Closed)
  - SERVER-95888 Mitigate stuck changestream bug by throwing exception after period of idleness (Closed)
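
The mitigation named in SERVER-95888 (throw an exception after a period of idleness) can be illustrated with a small sketch. This is not the mongostream implementation. The names IdleWatchdog, IdleTimeoutError, run_once, and fetch_batch are hypothetical, and the timeout value is an arbitrary assumption. The idea is that a loop which keeps running (so a runOnce-style counter keeps incrementing) but never receives input eventually fails loudly instead of appearing healthy.

```python
import time
from typing import Callable, List


class IdleTimeoutError(RuntimeError):
    """Raised when no input has been seen for longer than the allowed idle window."""


class IdleWatchdog:
    """Tracks the time of the last observed input (hypothetical helper)."""

    def __init__(self, max_idle_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_idle = max_idle_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._last_activity = clock()

    def record_activity(self) -> None:
        # Call whenever the consumer actually receives input.
        self._last_activity = self._clock()

    def check(self) -> None:
        # Raise if the consumer has been idle longer than the limit.
        idle = self._clock() - self._last_activity
        if idle > self._max_idle:
            raise IdleTimeoutError(
                f"no input for {idle:.1f}s (limit {self._max_idle:.1f}s)"
            )


def run_once(fetch_batch: Callable[[], List], watchdog: IdleWatchdog) -> int:
    """One loop iteration: fetch a batch, then update or check the watchdog."""
    batch = fetch_batch()
    if batch:
        watchdog.record_activity()
    else:
        # The loop is alive but nothing arrived; fail once idle for too long.
        watchdog.check()
    return len(batch)
```

A caller would catch IdleTimeoutError and restart or re-create the consumer. One trade-off of this approach is that a source that is legitimately quiet for longer than the limit also triggers the exception, so the limit has to be chosen with expected traffic in mind.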