[CSHARP-4934] Not automatically recovering after a failover Created: 17/Jan/24  Updated: 24/Jan/24  Resolved: 24/Jan/24

Status: Closed
Project: C# Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Unknown
Reporter: tim mcgrath Assignee: James Kovacs
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
2. Would you like the user to see examples of the syntax and/or executable code and its output?
3. Which versions of the driver/connector does this apply to?


 Description   

Summary

After a failover is triggered on a multi-node AWS DocumentDB cluster and the server recovers, the C# driver does not work again until the application is restarted.

Please provide the version of the driver. If applicable, please provide the MongoDB server version and topology (standalone, replica set, or sharded cluster).

How to Reproduce

Driver version:  2.23.1
DB: AWS DocumentDB, size intermediate, with 3 nodes of the same priority
Server: AWS App Runner instance in a VPC with the AWS DocumentDB cluster described above.

Using the C# driver version above in an ASP.NET server deployed to AWS App Runner, configured as above:
1. Have a server running and handling write requests
2. Use failover-db-cluster to trigger a failover event (https://awscli.amazonaws.com/v2/documentation/api/latest/reference/rds/failover-db-cluster.html)
3. Keep the server responding to requests
 
Code used
 
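// Client is created once and reused for the lifetime of the server.
var settings = MongoClientSettings.FromUrl(new MongoUrl(connectionString));
_client = new MongoClient(settings);

// Controller action that handles the write requests.
public async Task Put([FromBody] TestDocument document)
{
    await collection.InsertOneAsync(document);
}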
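
The reproduction steps can also be approximated outside ASP.NET with a small console probe that writes once per second and logs each outcome, which makes it easy to see whether writes resume after the failover. This is only a sketch: the MONGODB_URI environment variable, the failover_test database, and the probe collection are hypothetical names, not part of the original report.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical: connection string is read from an environment variable.
var connectionString = Environment.GetEnvironmentVariable("MONGODB_URI")
    ?? "mongodb://localhost:27017";

// Same client construction as in the report; the client is created once and reused.
var settings = MongoClientSettings.FromUrl(new MongoUrl(connectionString));
var client = new MongoClient(settings);

// Hypothetical database and collection names.
var collection = client.GetDatabase("failover_test").GetCollection<BsonDocument>("probe");

while (true)
{
    try
    {
        // One write per iteration, mirroring the Put handler above.
        await collection.InsertOneAsync(new BsonDocument("ts", DateTime.UtcNow));
        Console.WriteLine($"{DateTime.UtcNow:O} write ok");
    }
    catch (Exception ex)
    {
        // Failures are expected during the failover; the question is whether they stop afterward.
        Console.WriteLine($"{DateTime.UtcNow:O} write failed: {ex.GetType().Name}: {ex.Message}");
    }

    await Task.Delay(TimeSpan.FromSeconds(1));
}
```

Running the same probe against a self-hosted replica set or MongoDB Atlas and against DocumentDB would show whether the behaviour is specific to one backend.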

 
Expected
Once the failover is complete, the server is expected to process requests again without restarting or recreating the MongoDB client.
 
Actual
The server is not able to complete write requests again until restarted.
Other servers can connect and write afterward.
 
Workaround
The only way I found to recover automatically was to recreate the client with different settings; the new client then recovers after the failover.

Additional Background

Connection string options: ssl=false&retryWrites=false&readPreference=SecondaryPreferred



 Comments   
Comment by tim mcgrath [ 24/Jan/24 ]

Ok, thank you for looking into it for me.

Comment by James Kovacs [ 24/Jan/24 ]

Hi, tim.mcgrath@objective.com,

Thank you for reporting this issue. When connected to a MongoDB cluster - either self-hosted or MongoDB Atlas - the .NET/C# Driver recovers successfully after failover events. Our test suite verifies a variety of failover scenarios. If rediscovery of the cluster primary is not successful when connected to AWS DocumentDB, that indicates a problem with AWS DocumentDB not emitting the expected responses to heartbeats. I would suggest logging MongoClient.Cluster.Description to compare the driver's view of the cluster topology with the current state of the cluster.
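
As a rough illustration of the logging suggested above, the following sketch prints the driver's current view of the topology and logs every subsequent change. It assumes driver 2.x, where MongoClient.Cluster exposes Description and a DescriptionChanged event; the connection string and the LogTopology helper are hypothetical placeholders.

```csharp
using System;
using MongoDB.Driver;

// Hypothetical connection string; substitute the real cluster endpoints.
var client = new MongoClient("mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0");

// Log each topology change the driver observes (for example, a new primary after failover).
client.Cluster.DescriptionChanged += (_, args) =>
    Console.WriteLine($"{DateTime.UtcNow:O} topology changed: {args.NewClusterDescription}");

// Log the current view once; call again on demand, for example after a failed write.
LogTopology(client);

// Hypothetical helper: prints the cluster type and state, then each known server.
static void LogTopology(IMongoClient c)
{
    var description = c.Cluster.Description;
    Console.WriteLine($"Cluster type={description.Type}, state={description.State}");
    foreach (var server in description.Servers)
    {
        Console.WriteLine($"  {server.EndPoint}: type={server.Type}, state={server.State}");
    }
}
```

If the logged description never shows a server of the primary type after the failover completes, that would be consistent with the driver not receiving the heartbeat responses it expects.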

If you can reproduce this behaviour when connected to a self-hosted MongoDB cluster or MongoDB Atlas, we would be happy to investigate further. If this problem is specific to AWS DocumentDB, then please contact Amazon technical support to investigate further.

Sincerely,
James

Comment by tim mcgrath [ 17/Jan/24 ]

Sorry, that should have been "failover" rather than "fall over", and it doesn't seem like I can edit the issue.

Comment by PM Bot [ 17/Jan/24 ]

Hi tim.mcgrath@objective.com, thank you for reporting this issue! The team will look into it and get back to you soon.

Generated at Wed Feb 07 21:49:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.