-
Type: Task
-
Resolution: Duplicate
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Atlas Streams
- There are some places where we use tassert that might fail due to transient network errors. We should change these to uasserts to avoid noise in the #asp-engine-warnings channel (unless we want the stack traces to show up in that channel). An example is the "KafkaConnectAuthCallback::readSocketData received less data than" assert discussed in this Slack thread: https://mongodb.slack.com/archives/C07R76FQ3UJ/p1729873119093639?thread_ts=1729872106.807049&cid=C07R76FQ3UJ
- We should improve the error message below to indicate that a transient network error is occurring. The same applies in some other places.
tassert(ErrorCodes::InternalError,
str::stream() << "KafkaConnectAuthCallback::readSocketData received less data than "
"expected, received a total of "
<< bytesStored << " bytes, and expected " << readBuffer.size(),
bytesStored == readBuffer.size());
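A minimal, self-contained sketch of the proposed change, for illustration only. It does not use the real MongoDB assertion macros: `UserAssertionError`, `userAssert`, `kTransientNetworkErrorCode`, and `checkReadComplete` are hypothetical stand-ins, and the exact message wording and error code are assumptions that would be decided in the actual change.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the exception a uassert-style check would throw.
// This is NOT the real MongoDB exception type.
class UserAssertionError : public std::runtime_error {
public:
    UserAssertionError(int code, const std::string& msg)
        : std::runtime_error(msg), _code(code) {}
    int code() const { return _code; }

private:
    int _code;
};

// Hypothetical error code; the real code to use is not specified in the ticket.
constexpr int kTransientNetworkErrorCode = 1;

// Stand-in for a uassert-style check: throws a user-facing error when
// `expr` is false, instead of raising an internal (tassert-style) failure.
inline void userAssert(int code, const std::string& msg, bool expr) {
    if (!expr) {
        throw UserAssertionError(code, msg);
    }
}

// Mirrors the check in readSocketData, with a message that names the
// likely cause (a transient network error) as the ticket suggests.
inline void checkReadComplete(std::size_t bytesStored, std::size_t expected) {
    std::ostringstream msg;
    msg << "KafkaConnectAuthCallback::readSocketData received less data than "
           "expected, likely due to a transient network error; received a total of "
        << bytesStored << " bytes, and expected " << expected;
    userAssert(kTransientNetworkErrorCode, msg.str(), bytesStored == expected);
}
```

With this shape, a short read surfaces as an ordinary error carrying a descriptive message rather than as an internal assertion with a stack trace.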
Make a follow-up ticket for:
- Currently, when the KafkaConnectAuthCallback and KafkaResolveCallback classes throw an InternalError, it gets bubbled up through librdkafka. Our code in KafkaPartitionConsumer / KafkaEmitOperator eventually throws a StreamProcessorKafkaConnectionError, which is considered a user error. If we want to be alerted on "internal errors" for VPC peering, we should amend this flow to ultimately throw a different error code (StreamProcessorVPCConnectionError?).
- This error will happen from time to time, so we can also increase the alert threshold when we do this: https://github.com/10gen/mongohouse/blob/master/alerts/mhouse_streams/mhouse-streams-prod.yaml#L195
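The follow-up flow could be sketched roughly as below. This is only an illustration under stated assumptions: the enums and functions are hypothetical, `StreamProcessorVPCConnectionError` is the tentative name floated above, and how the failure origin would actually be recovered after passing through librdkafka is not covered by this ticket.

```cpp
// Hypothetical error codes for illustration; not the real ErrorCodes definitions.
enum class StreamErrorCode {
    StreamProcessorKafkaConnectionError,  // today's outcome, treated as a user error
    StreamProcessorVPCConnectionError,    // tentative name for the internal variant
};

// Hypothetical tag for where the connection failure originated.
enum class FailureOrigin {
    Broker,              // ordinary broker connection failure
    VpcPeeringCallback,  // raised by KafkaConnectAuthCallback / KafkaResolveCallback
};

// Picks the final error code based on where the failure originated.
inline StreamErrorCode classifyConnectionFailure(FailureOrigin origin) {
    return origin == FailureOrigin::VpcPeeringCallback
        ? StreamErrorCode::StreamProcessorVPCConnectionError
        : StreamErrorCode::StreamProcessorKafkaConnectionError;
}

// Only the Kafka connection error counts as a user error in this sketch,
// so the VPC variant remains eligible for internal alerting.
inline bool isUserError(StreamErrorCode code) {
    return code == StreamErrorCode::StreamProcessorKafkaConnectionError;
}
```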
- is related to
-
SERVER-92701 Propagate more specific rdkafka connection errors to user-facing assertions
- Closed