- Type: Task
- Resolution: Gone away
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Atlas
- Labels:
We have had a number of customers experience timeouts when connecting from applications hosted on Azure to their Atlas clusters. It would be useful to update our docs with steps customers can take to reduce or eliminate these timeouts.
Possible reason for timeouts
Given that your application is hosted on Azure, and given the nature of the error you're observing, note that the TCP idle timeout on the Azure load balancer is 240 seconds by default. The load balancer can silently drop connections if the TCP keepalive time on your Azure systems is greater than this value, because an idle connection then reaches the timeout before the first keepalive probe is sent.
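As a rough illustration of the comparison described above, the following Python sketch reads the keepalive time on a Linux application host and checks it against the 240-second load balancer default. The function names and the constant are hypothetical, chosen only for this example, and the `/proc` path applies to Linux hosts only.

```python
from pathlib import Path

# Assumption: 240 seconds is the load balancer default cited in this ticket.
LOAD_BALANCER_DEFAULT_SECONDS = 240

# Standard Linux location of the kernel setting; other operating systems differ.
KEEPALIVE_PATH = Path("/proc/sys/net/ipv4/tcp_keepalive_time")


def keepalive_exceeds_limit(keepalive_seconds, limit_seconds=LOAD_BALANCER_DEFAULT_SECONDS):
    """Return True when the first keepalive probe would be sent after the limit."""
    return keepalive_seconds > limit_seconds


def read_tcp_keepalive_time(path=KEEPALIVE_PATH):
    """Read tcp_keepalive_time in seconds, or return None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_tcp_keepalive_time()
    if current is None:
        print("tcp_keepalive_time is not readable on this host")
    elif keepalive_exceeds_limit(current):
        print(f"tcp_keepalive_time={current}s exceeds {LOAD_BALANCER_DEFAULT_SECONDS}s; "
              "idle connections may be dropped")
    else:
        print(f"tcp_keepalive_time={current}s is within {LOAD_BALANCER_DEFAULT_SECONDS}s")
```

The Linux default for `tcp_keepalive_time` is 7200 seconds, so an untuned host would be flagged by this check.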
Some recommendations we typically make to customers:
- Setting the maxIdleTimeMS connection option to 120000 (120 seconds, half the 240-second default) should reduce the timeouts.
- Setting tcp_keepalive_time on the application hosts to 120 (seconds) can also help. Further information can be found in the Production Notes documentation.
- It can be worth hosting both the app server(s) and the Atlas cluster in the same Azure region, as this shortens the network path between them and should also help.
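To make the first recommendation concrete, here is a minimal Python sketch that adds `maxIdleTimeMS=120000` to a connection string using only the standard library. The helper name `with_max_idle_time` and the host names in the usage lines are hypothetical placeholders, not real clusters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_max_idle_time(uri, max_idle_ms=120000):
    """Return the connection string with maxIdleTimeMS set to max_idle_ms."""
    parts = urlsplit(uri)
    # Keep every existing option except a previous maxIdleTimeMS, compared
    # case-insensitively so the option is replaced rather than duplicated.
    options = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "maxidletimems"
    ]
    options.append(("maxIdleTimeMS", str(max_idle_ms)))
    # Ensure there is a "/" between the host list and the "?" that starts the options.
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(options), parts.fragment))


if __name__ == "__main__":
    # Hypothetical host name, used only for illustration.
    print(with_max_idle_time("mongodb+srv://cluster0.example.mongodb.net/?retryWrites=true"))
```

The resulting string can be passed to the driver in place of the original connection string. Drivers that expose the option directly, such as PyMongo's `MongoClient(uri, maxIdleTimeMS=120000)`, make the helper unnecessary.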