[CSHARP-4001] SRV and TXT DNS Failures - Duplicated DNS Response Breaks Subsequent DNS Response Read Created: 20/Dec/21  Updated: 28/Oct/23  Resolved: 31/Jan/22

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: None
Fix Version/s: 2.15.0

Type: Bug Priority: Major - P3
Reporter: Jack Alder Assignee: James Kovacs
Resolution: Fixed Votes: 6
Labels: Kubernetes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File test-broken.pcap     File test-working.pcap    
Issue Links:
Related
is related to CSHARP-3430 DnsClient.NET failures in Kubernetes ... Closed
Case:
Backwards Compatibility: Fully Compatible

 Description   

Summary

This issue only affects SRV and TXT lookups (mongodb+srv:// connection strings) in environments where DNS servers are incorrectly sending duplicate DNS responses for a single DNS request (Azure AKS in this reproduction).

As a result, SRV lookups will "fail" in the driver, leading to a server selection timeout with an empty Servers [] list.

Please provide the version of the driver. If applicable, please provide the MongoDB server version and topology (standalone, replica set, or sharded cluster).

MongoDB.Bson: 2.14.1
MongoDB.Driver: 2.14.1
MongoDB.Driver.Core: 2.14.1
DnsClient: 1.4.0 (also tested 1.5.0)

MongoDB Atlas 4.4.10 - M2 Replica Set

How to Reproduce

Deploy a C# .NET application to an environment that tends to send repeated/duplicated DNS responses to a single query. In this reproduction, we used Azure AKS with CoreDNS (the default) and Kubernetes 1.21.
Container tests were built from both Ubuntu 18.04 and Ubuntu 20.04 with .NET 5.0.
Connect to the MongoDB deployment using a mongodb+srv:// connection string. Repeat the connection attempts until duplicated DNS responses are seen in a tcpdump capture. The expected result is a failure in the SRV and/or TXT lookup, followed by a server selection timeout.
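
To confirm that a capture actually contains duplicated responses, the DNS transaction IDs can be compared. The following is a minimal Python sketch, assuming the UDP payloads have already been extracted from the tcpdump capture into a list of byte strings (the extraction step is not shown). It relies only on the standard 12-byte DNS header layout; `fake_message` and the sample IDs are hypothetical and exist only to make the example self-contained. It matches on transaction ID alone, so a legitimately retransmitted query would also be counted.

```python
import struct
from collections import Counter

# DNS header: id, flags, qdcount, ancount, nscount, arcount (all 16-bit, big-endian).
DNS_HEADER = struct.Struct("!HHHHHH")
QR_RESPONSE = 0x8000  # top bit of the flags field: 0 = query, 1 = response


def duplicated_response_ids(payloads):
    """Return the transaction IDs that appear in more than one response."""
    seen = Counter()
    for payload in payloads:
        if len(payload) < DNS_HEADER.size:
            continue  # too short to hold a DNS header
        txid, flags, *_ = DNS_HEADER.unpack_from(payload)
        if flags & QR_RESPONSE:
            seen[txid] += 1
    return sorted(txid for txid, count in seen.items() if count > 1)


def fake_message(txid, is_response):
    """Build a header-only DNS message (hypothetical helper for illustration)."""
    flags = QR_RESPONSE if is_response else 0
    return DNS_HEADER.pack(txid, flags, 1, 0, 0, 0)


# Hypothetical traffic: the first query is answered twice, the second once.
sample = [
    fake_message(0x1A2B, False),
    fake_message(0x1A2B, True),
    fake_message(0x1A2B, True),
    fake_message(0x3C4D, False),
    fake_message(0x3C4D, True),
]
```

Calling `duplicated_response_ids(sample)` reports only the ID that was answered more than once.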

Additional Background

DnsClient appears to hold the duplicate copy of the first response in its queue. In this case we ask for TXT first and then SRV, but the order of the requests should not matter.
When the response for SRV is read, the duplicate TXT response is returned instead, leading to a DNS header mismatch warning in the DnsClient verbose logs. When DnsClient is not being tested directly, the failure surfaces only as a server selection timeout, and the MongoDB driver logs no other details.
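
The behavior described above can be modeled with a simple queue. This is a Python sketch of the failure mode only, not DnsClient.NET's actual code and not necessarily how the upstream fix works. The transaction IDs, the function names, and the tuple representation of a datagram are all assumptions made for illustration.

```python
from collections import deque


def read_naive(socket_queue, expected_id):
    """Take whatever datagram is next, as a client that trusts ordering would."""
    txid, record_type = socket_queue.popleft()
    if txid != expected_id:
        raise ValueError(
            f"header mismatch: expected id {expected_id}, got {txid} ({record_type})"
        )
    return record_type


def read_skipping_stale(socket_queue, expected_id):
    """Discard datagrams whose ID does not match the outstanding query."""
    while socket_queue:
        txid, record_type = socket_queue.popleft()
        if txid == expected_id:
            return record_type
    raise TimeoutError("no matching response received")


def responses_on_the_wire():
    """(transaction id, record type): the TXT answer arrives twice, then SRV."""
    return deque([(1, "TXT"), (1, "TXT"), (2, "SRV")])
```

With `read_naive`, the TXT read succeeds and the following SRV read picks up the leftover TXT duplicate and fails. With `read_skipping_stale`, the duplicate is dropped and the SRV answer is returned.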

I've linked CSHARP-3430, where I believe the root issue was not resolved. The exception was replaced with a warning, but resolving the SRV record to a host list continues to fail.

It's important to note that A and AAAA lookups in a similar environment do not cause issues. As a workaround, you may use the mongodb:// connection string to avoid SRV and TXT lookups.
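
For the workaround, the mongodb:// connection string has to carry everything the SRV and TXT records would otherwise supply: the explicit host list, the options normally delivered via TXT (such as replicaSet and authSource), and TLS, which mongodb+srv:// enables implicitly. Below is a minimal Python sketch of assembling such a string. The function name, host names, credentials, and option values are hypothetical placeholders; the real values must be taken from the deployment.

```python
from urllib.parse import quote_plus, urlencode


def build_standard_uri(hosts, username, password, database="", **options):
    """Assemble a mongodb:// URI from an explicit host list."""
    # mongodb+srv:// turns TLS on implicitly; mongodb:// does not, so set it
    # here unless the caller has already chosen a value.
    options.setdefault("tls", "true")
    # Percent-encode credentials so characters like '@' or '/' stay valid.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    query = urlencode(options)
    return f"mongodb://{credentials}{','.join(hosts)}/{database}?{query}"


# Hypothetical values for illustration only.
uri = build_standard_uri(
    ["host-00.example.net:27017", "host-01.example.net:27017"],
    "appUser",
    "p@ss/word",
    replicaSet="rs0",
    authSource="admin",
)
```

The resulting string can then be used in place of the mongodb+srv:// connection string, at the cost of having to update the host list by hand if the deployment's members change.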



 Comments   
Comment by James Kovacs [ 01/Feb/22 ]

This change will be included with 2.15.0.

Comment by Githook User [ 01/Feb/22 ]

Author:

{'name': 'James Kovacs', 'email': 'jkovacs@post.harvard.edu', 'username': 'JamesKovacs'}

Message: Revert "CSHARP-4001: Upgrade to DnsClient.NET 1.6.0, which includes the fix for duplicate DNS responses. (#725)"

This reverts commit 3a6dc880244f6b9b19b1a6558861bbfa6dd050be.
Branch: v2.14.x
https://github.com/mongodb/mongo-csharp-driver/commit/b429f49300b4612b6a02338b5a849f16bb9137e1

Comment by Githook User [ 31/Jan/22 ]

Author:

{'name': 'James Kovacs', 'email': 'jkovacs@post.harvard.edu', 'username': 'JamesKovacs'}

Message: CSHARP-4001: Upgrade to DnsClient.NET 1.6.0, which includes the fix for duplicate DNS responses. (#725)
Branch: v2.14.x
https://github.com/mongodb/mongo-csharp-driver/commit/3a6dc880244f6b9b19b1a6558861bbfa6dd050be

Comment by Githook User [ 31/Jan/22 ]

Author:

{'name': 'James Kovacs', 'email': 'jkovacs@post.harvard.edu', 'username': 'JamesKovacs'}

Message: CSHARP-4001: Upgrade to DnsClient.NET 1.6.0, which includes the fix for duplicate DNS responses. (#725)
Branch: master
https://github.com/mongodb/mongo-csharp-driver/commit/4c67659af462289af104a126cd27eaf1e69345db

Comment by James Kovacs [ 05/Jan/22 ]

The root cause is issue #140 in DnsClient.NET, which I just reported along with a complete repro. Moving this ticket to blocked while waiting on a fix from DnsClient.NET.

Generated at Wed Feb 07 21:46:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.