Type: Improvement
Resolution: Unresolved
Priority: Unknown
Component/s: Networking
Why this matters for triage: This work is part of the Atlas Availability program and addresses a recognized canonical failure scenario (dns-connectivity-tls-issue) in the Availability canonical scenarios doc.
These connection failures are false-positive availability incidents: the cluster is healthy, but the customer experiences a full production outage and perceives Atlas as broken until support gets involved. From the customer's perspective, time-to-resolution is identical to a real availability incident.
Impact today:
- Recurring source of HELP/support escalations driven by customer-side misconfiguration (IP access list, custom DNS resolvers, TLS).
- Long resolution time because the driver's low-level error gives the customer no actionable next step, so they wait on support instead of self-fixing.
- Counts against perceived Atlas availability even though the platform itself is healthy.
When customers fail to connect to an Atlas cluster due to common network misconfigurations, the driver currently surfaces generic low-level errors. This makes it hard for customers to self-diagnose, leading to unnecessary support escalations.
Proposal: When a connection error occurs and the connection string is an Atlas connection string, enhance the error message with a hint and a link for known failure patterns. Each link would point to a dedicated page with a detailed explanation of the error, diagnostic commands to run, and how to interpret the output.
To avoid creating a hard dependency on the docs team's URL stability, links would use backend-configurable short URLs (e.g. mongodb.com/link/<error-slug>) that can be redirected to the current canonical page at any time, similar to the pattern used by React dev errors.
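A rough sketch of the two building blocks the proposal relies on: deciding whether a connection string targets Atlas, and building the short link for an error slug. This is illustrative only. The `.mongodb.net` host suffix check, the `https://` scheme on the link, and the function names `is_atlas_connection_string` and `short_link` are assumptions for this example, not an agreed driver API.

```python
from urllib.parse import urlsplit

# Assumption: Atlas cluster hostnames end in ".mongodb.net". A real
# implementation would need an agreed list of Atlas host suffixes.
ATLAS_HOST_SUFFIX = ".mongodb.net"

# Assumption: short links are served over HTTPS from this base.
LINK_BASE = "https://mongodb.com/link/"


def is_atlas_connection_string(uri: str) -> bool:
    """Return True if any host in the connection string looks like an Atlas host."""
    parts = urlsplit(uri)
    if parts.scheme not in ("mongodb", "mongodb+srv"):
        return False
    # netloc may look like "user:pass@host1:27017,host2:27017";
    # drop the credentials, then split the comma-separated host list.
    hosts = parts.netloc.rsplit("@", 1)[-1].split(",")
    # Strip an optional ":port" from each host before checking the suffix.
    return any(
        host.rsplit(":", 1)[0].lower().endswith(ATLAS_HOST_SUFFIX)
        for host in hosts
    )


def short_link(slug: str) -> str:
    """Build the redirectable short URL for a given error slug."""
    return LINK_BASE + slug
```

Because the driver would only embed the slug, the destination page can change on the backend without a driver release.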
Example scenarios (to be scoped in detail in a follow-up shared doc):
- Connection refused / timeout + Atlas connection string → hint to check IP Access List, link to mongodb.com/link/atlas-ip-access-list
- SRV record resolution failure + Atlas connection string → hint to check DNS/custom resolver, link to mongodb.com/link/atlas-dns-resolution (e.g. the page includes a dig command and a description of the expected output)
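The two example scenarios above could be sketched as a lookup from failure category to hint and link. This is a minimal illustration, not a proposed implementation: the `FailureKind` enum, the `enhance_error` function, and the hint wording are hypothetical, and how each driver classifies its low-level errors into these categories is left open. Only the two link slugs come from this ticket.

```python
from enum import Enum


class FailureKind(Enum):
    """Hypothetical failure categories a driver might derive from low-level errors."""
    CONNECTION_REFUSED_OR_TIMEOUT = "connection_refused_or_timeout"
    SRV_RESOLUTION_FAILED = "srv_resolution_failed"
    OTHER = "other"


# Hint text is placeholder wording; the slugs are the ones named in this ticket.
HINTS = {
    FailureKind.CONNECTION_REFUSED_OR_TIMEOUT: (
        "Check that this client's IP address is on the cluster's IP Access List.",
        "mongodb.com/link/atlas-ip-access-list",
    ),
    FailureKind.SRV_RESOLUTION_FAILED: (
        "Check your DNS configuration or custom resolver.",
        "mongodb.com/link/atlas-dns-resolution",
    ),
}


def enhance_error(message: str, kind: FailureKind, is_atlas: bool) -> str:
    """Append a hint and link to the message for known Atlas failure patterns."""
    hint = HINTS.get(kind) if is_atlas else None
    if hint is None:
        # Non-Atlas connection strings and unknown failures keep the original message.
        return message
    text, link = hint
    return f"{message} Hint: {text} See {link}"
```

Keeping the original low-level message at the front means existing log parsing and error matching would continue to work, with the hint only appended.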
Next steps: If drivers can prioritize this, we'll work together to agree on a concrete list of scenarios and their associated messages/links.
- is related to: DRIVERS-2023 Add section to troubleshooting FAQ per driver with top SEO results