Comment 18 for bug 1745463

Thayne (thayne-u) wrote :

> Can you help me understand the motivation underlying the requirement?

The motivation is that if the current DNS server goes down, we don't have to wait for each request to time out before making another request to the next server. Before resolved this was more of a problem, since libc had no memory of failures and would always try the first server even if it was down, so the timeout was added to every DNS query. I understand that resolved does remember when a server is down and keeps using the server it switched to, but afaict it never switches back, and that is a problem, as I'll explain below.

> Do you have DNS servers in your configuration which are always farther / slower? If so, why do you have them in your configuration at all?

Yes. We have one DNS server in each availability zone (AWS). We would prefer to use the DNS server in the same availability zone as the host, but want to fall back to one of the others if that one becomes unavailable.
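
As a rough illustration of the behaviour we are after (this is not how resolved or libc is implemented), here is a minimal Python sketch. The class name, the `retry_after` interval, the injected clock and the addresses in the usage note are all hypothetical; the only point is that the preferred server is tried again once some time has passed since its last failure.

```python
import time


class PreferredServerSelector:
    """Prefer the first server, fall back on failure, retry preferred ones later."""

    def __init__(self, servers, retry_after=30.0, clock=time.monotonic):
        self.servers = list(servers)    # ordered by preference, same-zone server first
        self.retry_after = retry_after  # seconds before a failed server is retried
        self.clock = clock              # injectable so the logic can be tested
        self.failed_at = {}             # server -> time of its last observed failure

    def mark_failed(self, server):
        """Record that a query to this server timed out or errored."""
        self.failed_at[server] = self.clock()

    def mark_ok(self, server):
        """Record that this server answered, clearing any earlier failure."""
        self.failed_at.pop(server, None)

    def current(self):
        """Return the most preferred server that is not in its failure window."""
        now = self.clock()
        for server in self.servers:
            failed = self.failed_at.get(server)
            if failed is None or now - failed >= self.retry_after:
                return server
        # Every server failed recently: retry the one that failed longest ago.
        return min(self.servers, key=lambda s: self.failed_at[s])
```

With servers listed as, say, `["10.0.1.2", "10.0.2.2", "10.0.3.2"]` (made-up addresses, one per availability zone), `current()` returns the first one until `mark_failed` is called on it, then the second, and goes back to the first once `retry_after` seconds have elapsed.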

> Do you have a particular application requirement for low-latency DNS resolution - and if so, wouldn't the use of a caching local resolver (a configuration which resolved supports, and which we enable by default) have more of an impact on satisfying that requirement?

The less time it takes for our application servers to receive an answer, the better. Bandwidth for DNS traffic is very cheap, and the CPU overhead of DNS is also very low. We want to be able to use the fastest answer available from an upstream DNS server. We also want redundancy so that we remain unaffected by planned and unplanned maintenance or issues. A local DNS cache helps with fast responses, but it does not help with an upstream server's availability.