
It strikes me that, when growing in this way, the DNS layer is more and more the critical choice. It needs to be back-end agnostic and provide an increasing amount of functionality.

Health checks and failover are must-haves now, but this article makes me wonder three things:

1) Are there any DNS services that understand the geography of your "zones", i.e. that route and fail over based on client IP (but are still platform agnostic)?

2) How long can a DNS failover take in the worst case? You can technically set a low TTL, but don't a lot of ISPs just raise that to some minimum of their own?

3) Isn't it better to replace some of the DNS failover with high availability dedicated load balancing?



1) Yes, there are several DNS service providers that offer BGP anycast with geographically aware failover / load balancing. UltraDNS and Dyn are the larger ones.

2) Yes, some ISPs do set a minimum TTL. Although BGP anycast is the most effective first line, sometimes it makes sense to have your reverse proxy caching layer override that distribution based on GeoIP and redirect to a more suitable proxy node closer to the client. This is especially the case when people use recursive DNS resolvers that aren't necessarily geographically close to them (e.g. 8.8.8.8). It can also help in cases where clients are still holding a cached record whose TTL hasn't expired yet.

3) No. Think of BGP Anycast DNS as distribution at a global level, and dedicated load balancers as distribution at the local level. You need to work out how to get the traffic to the load balancer first, and load balancing across distant geographies (high latency) results in horrible performance.
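
To illustrate the GeoIP override mentioned in 2), here is a minimal sketch of how a proxy node might decide whether to redirect a client to a closer node. The node names, coordinates, and function names are all hypothetical, and it assumes the client's latitude/longitude has already been obtained from some GeoIP database (that lookup is not shown).

```python
import math

# Hypothetical proxy nodes: name -> (latitude, longitude) in degrees.
PROXY_NODES = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_node(client_location, nodes=PROXY_NODES):
    """Name of the node geographically closest to the client."""
    return min(nodes, key=lambda name: haversine_km(client_location, nodes[name]))


def redirect_target(current_node, client_location, nodes=PROXY_NODES):
    """Node to redirect to, or None if the current node is already the closest."""
    best = nearest_node(client_location, nodes)
    return None if best == current_node else best
```

Geographic distance is only a rough proxy for network latency, so a real deployment would probably weigh in measured latency and node load as well.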


We use DNS-based load balancing along with an HA pair of load balancers in each datacenter. If the DNS health check fails, we stop sending traffic to the failing frontend LB. If the failing LB is dead, we move its IP to the other one.
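
The two mechanisms described above can be sketched roughly as follows. This is only an illustration, not the actual setup: the function names and the 192.0.2.x addresses are made up, the health checks are passed in as plain callables, and publishing every address when all checks fail is an assumption on my part rather than something stated above.

```python
def dns_answer(vips, passes_health_check):
    """IPs the DNS service should hand out for the frontend name.

    Only addresses that pass the health check are published. Assumption:
    if every check fails, publish all addresses rather than an empty answer.
    """
    healthy = [ip for ip in vips if passes_health_check(ip)]
    return healthy if healthy else list(vips)


def vip_owners(pair, alive):
    """Map each virtual IP to the load balancer currently holding it.

    pair:  dict of LB name -> the IP it normally owns (an HA pair).
    alive: callable returning True if the named LB is up.
    A dead LB's IP moves to a surviving peer; if none is alive, the IP
    is left unassigned.
    """
    owners = {}
    for lb, vip in pair.items():
        if alive(lb):
            owners[vip] = lb
            continue
        survivors = [peer for peer in pair if peer != lb and alive(peer)]
        if survivors:
            owners[vip] = survivors[0]
    return owners
```

For example, with `pair = {"lb-a": "192.0.2.10", "lb-b": "192.0.2.11"}` and `lb-a` down, both addresses end up on `lb-b`, so clients holding a cached record for `192.0.2.10` still reach a working box.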

DNS TTL is not as big of an issue today as it was 5-10 years ago, when idiotic ISPs were trying to save on DNS resolving by ignoring TTLs. Nowadays you see an almost perfect drop in traffic when switching off a load balancer. Only bots and some weird exotic ISPs may keep sending traffic to a disabled box for up to an hour or two. But since DNS LB is only used to handle real emergency outages (for planned maintenance we can just move LB IPs around), I really do not see it as a big enough issue to stop using the DNS LB magic :-)
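
For a rough feel of the worst-case failover delay discussed in this thread, here is a back-of-the-envelope sketch. The parameter names are hypothetical, and the model is deliberately simple: it assumes a resolver that enforces a minimum TTL just clamps the record's TTL upward, and it ignores propagation inside the DNS provider as well as OS and browser caches.

```python
def worst_case_failover_seconds(check_interval, failures_to_trip,
                                record_ttl, resolver_min_ttl=0):
    """Upper-bound estimate of how long clients may keep hitting a dead IP.

    check_interval:   seconds between health checks
    failures_to_trip: consecutive failed checks before the record is pulled
    record_ttl:       TTL set on the DNS record, in seconds
    resolver_min_ttl: minimum TTL enforced by a resolver, if any (assumption)
    """
    detection = check_interval * failures_to_trip
    effective_ttl = max(record_ttl, resolver_min_ttl)
    return detection + effective_ttl
```

With a 10-second check interval, 3 failures to trip, and a 30-second TTL this gives 60 seconds; a resolver that enforces a 300-second floor pushes the same setup to 330 seconds.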


Check out DynECT from http://www.dyn.com

Twitter, Mozilla and lots of other big names use them. I remember watching a webcast where Mozilla said they used Dyn's anycast failover service, with TTLs on their domains set to 5 seconds.

I've been using their DynECT entry-level package ($30/month) for a couple of years and it's great.

Edit: you might also find this comment from an old thread interesting/useful: https://news.ycombinator.com/item?id=7813589 (go up two levels to phil21's first comment - HN isn't giving me a direct link sadly)



