Type: Spec Change
Resolution: Unresolved
Priority: Unknown
Fix Version/s: None
Component/s: Initial DNS Seedlist Discovery, SRV Polling
Labels: None

Driver Changes: Needed
Case:

Summary

Provide way to prefer TCP for SRV lookup

Background & Motivation

DNS resolution is expected to first try with UDP, then retry with TCP if the UDP response indicates truncation.
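A minimal sketch of the check behind that fallback, assuming the client has the raw bytes of the UDP response. The TC bit is 0x0200 in the header flags field (RFC 1035, section 4.1.1). The helper names `is_truncated` and `choose_transport` are hypothetical and not taken from any driver:

```python
import struct

# Truncation (TC) bit in the 16-bit DNS header flags field (RFC 1035, section 4.1.1).
TC_FLAG = 0x0200


def is_truncated(response: bytes) -> bool:
    """Return True if the TC bit is set in a raw DNS response (hypothetical helper)."""
    if len(response) < 12:
        raise ValueError("DNS message is shorter than the 12-byte header")
    # Header layout: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (all 16-bit, big-endian).
    _id, flags, _qd, _an, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return bool(flags & TC_FLAG)


def choose_transport(udp_response: bytes) -> str:
    """Decide whether to retry over TCP, based only on the TC bit (hypothetical helper)."""
    return "tcp" if is_truncated(udp_response) else "udp"
```

A resolver that relies only on this check never retries over TCP when the server omits the TC bit, even if the answer section is incomplete.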

HELP-59749 notes a case where a customer observed a subset of SRV records returned in the UDP response, but the truncation flag (TC bit) was not set:

TCP fallback does not work on their DNS records because the DNS server does not support TC bit in the response header.

As a result, a different subset of SRV records was applied each time the SRV records were polled. I expect this results in connections being repeatedly closed and opened as servers are removed and added.

DNS and Truncation in UDP suggests this may not be isolated to the customer:

some 72,000 cases (91% of all such cases) where the resolver appears to be using truncated DNS response data occur for users located in just three networks, all located in China.

Proposal: add way to opt-in to using TCP to resolve SRV records first (rather than on retry). Consider adding a URI option: srvPreferTCP.
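As a rough illustration of how a driver might read such an option, here is a sketch using only the Python standard library. The option name srvPreferTCP comes from the proposal above and is not defined by any spec yet; the helper `srv_prefer_tcp`, the case-insensitive key match, and the default of false are assumptions:

```python
from urllib.parse import parse_qsl, urlsplit


def srv_prefer_tcp(uri: str) -> bool:
    """Return True if the URI opts in to TCP-first SRV lookup (hypothetical helper).

    Assumes the proposed option is spelled srvPreferTCP, is matched
    case-insensitively, and defaults to False when absent.
    """
    query = urlsplit(uri).query
    for key, value in parse_qsl(query):
        if key.lower() == "srvprefertcp":
            return value.lower() == "true"
    return False
```

For example, `srv_prefer_tcp("mongodb+srv://test1.kevinalbs.com/?srvPreferTCP=true")` would opt in, while a URI without the option keeps the existing UDP-first behavior.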

Alternatives

Using TCP initially by default is another option. RFC 7766 notes:

TCP ought to be considered a valid alternative transport to UDP, not purely a fallback option.

However, the RFC also describes possible disadvantages in Appendix A.

Testing

To observe TCP-retry behavior, use Wireshark to capture DNS traffic. In my case, I disabled Cloudflare WARP to disable DNS-over-HTTPS and ran the following Python:

from pymongo import MongoClient
client = MongoClient("mongodb+srv://test1.kevinalbs.com")

There were 30 SRV records for _mongodb._tcp.test1.kevinalbs.com, which resulted in the UDP response being truncated. In my case, the TC bit was set (as expected) and the TCP retry occurred (see the attached udp-with-tcp-fallback.png).

I have not reliably reproduced the issue in HELP-59749 (UDP response is truncated, but TC bit not set).

How does this affect the end user?

In the case of HELP-59749, a different subset of SRV records was applied each time the SRV records were polled. I expect this results in connections being repeatedly closed and opened as servers are removed and added.

How likely is it that this problem or use case will occur?

This occurred in HELP-59749. I expect this impacts multiple drivers (PyMongo, Go, Rust, and C all query with UDP first).

DNS and Truncation in UDP suggests this may not be isolated to the customer. However, the article suggests this impacts a small percentage of DNS environments.

If the problem does occur, what are the consequences and how severe are they?

In the case of HELP-59749, a different subset of SRV records was applied each time the SRV records were polled. The SRV records had a TTL of one minute. I expect this results in connections being repeatedly closed and opened as servers are removed and added.

The truncated records result in fewer mongos servers being available for the driver to use. In the case of HELP-59749, 9 mongos servers were expected, but only 6 were applied due to truncation.

Is this issue urgent?

No? HELP-59749 is urgent, but a C-driver-specific solution was made in ~~CDRIVER-5589~~.

Acceptance Criteria

When implemented (and enabled), SRV records will be queried with TCP.
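For reference, a sketch of what querying SRV records over TCP means on the wire: the same DNS query message is sent, prefixed with a two-byte length field (RFC 1035, section 4.2.2). The helper names `build_srv_query` and `frame_for_tcp` and the fixed query ID are hypothetical; drivers would normally delegate this to their DNS library rather than build messages by hand:

```python
import struct

SRV_TYPE = 33  # SRV record type (RFC 2782)
IN_CLASS = 1   # Internet class


def build_srv_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query message for SRV records (hypothetical helper)."""
    # Header: ID, flags (RD bit set), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", SRV_TYPE, IN_CLASS)


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its two-byte length, as required over TCP."""
    return struct.pack("!H", len(message)) + message
```

The framed bytes from `frame_for_tcp(build_srv_query("_mongodb._tcp.test1.kevinalbs.com"))` are what a TCP-first lookup would write to a connection on port 53, instead of sending the unframed message in a UDP datagram.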


udp-with-tcp-fallback.png
353 kB
Jun 03 2024 01:50:55 PM UTC

related to

CDRIVER-5589 Add option to prefer TCP for SRV lookup

Closed

Assignee: Unassigned
Reporter: Kevin Albertson
Votes: 2
Watchers: 8

Created: Jun 03 2024 01:49:14 PM UTC
Updated: May 01 2025 11:27:23 AM UTC
