Having redundant internet connections and a DNS fail-over service is essential for hosting our in-house web servers and other web services. If our internet connection goes down for any length of time, we start to feel like Tom Hanks on his island in Cast Away.
Why redundant internet connections?
The benefit of redundant internet connections is obvious: if the primary connection goes down, the backup takes over automatically with minimal disruption.
Right now, we’re using business cable modem service for our primary connection, with five IP addresses. Our secondary connection is metro ethernet, also with five IPs.
How do you switch between the connections? You need a firewall with WAN connection redundancy. We use a SonicWall TZ 205 firewall. It’s configured with both internet connections, and it detects when one of them can no longer route packets. It then switches over to the other connection automatically. The connections are prioritized, so when the preferred connection comes back online, the firewall switches back to it. We prioritize the connections because our cable modem runs at 50 Mbps/20 Mbps, while our metro ethernet runs at 18 Mbps/5 Mbps, so we want to use the cable connection whenever we can.
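The prioritized fail-over and fail-back behavior described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not how the SonicWall implements it; the `WanLink` class, the `pick_active_link` function, and the `is_up` flag (standing in for whatever probe the firewall runs) are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class WanLink:
    name: str
    priority: int  # lower number = more preferred
    is_up: bool    # assumed result of the firewall's own link probe


def pick_active_link(links):
    """Return the most-preferred link that is currently up, or None."""
    healthy = [link for link in links if link.is_up]
    return min(healthy, key=lambda link: link.priority, default=None)


cable = WanLink("cable", priority=1, is_up=True)
metro_e = WanLink("metro ethernet", priority=2, is_up=True)

print(pick_active_link([cable, metro_e]).name)  # cable

cable.is_up = False  # primary stops routing packets
print(pick_active_link([cable, metro_e]).name)  # metro ethernet

cable.is_up = True   # primary comes back, so we fail back to it
print(pick_active_link([cable, metro_e]).name)  # cable
```

Because the choice is recomputed from priority every time, fail-back needs no special case: as soon as the preferred link is healthy again, it wins.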
What about DNS?
Now, we host web services in-house, including web servers, terminal servers, and other services. Our external DNS records point to the IP addresses on our primary connection. The problem is that when the firewall fails over to the backup connection, our external IP addresses change. This doesn’t really affect our web access from inside the building. However, it does affect all of the customers accessing our web services from outside.
What is DNS Fail-Over?
We address this by using DNS fail-over. We use a hosted DNS service from DNS Made Easy, and for each DNS entry we can define a set of fail-over criteria. If these criteria are met, the service replaces the IP address on the DNS entry. Here’s their tutorial on how it works.
In our case, the criterion for each domain entry is a rule that tries to retrieve a small file from a solid-state web server in our building. If the service can retrieve that file via the primary IP, all is well. If it can’t, it tries to retrieve the file via the corresponding backup IP. (Remember, we have five IP addresses on each connection.) If it can retrieve the file via the backup IP, it switches the DNS entry to use the backup IP. At this point, our DNS entries point to the backup internet connection and everything works. Our services are available via their domain names no matter which set of IPs is active.
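Here is a minimal sketch of that decision in Python. It is not DNS Made Easy’s implementation, just the logic as described above. The function names, the `/health.txt` path, and the IP addresses (taken from the ranges reserved for documentation) are made up for illustration, and leaving the record unchanged when neither IP answers is an assumption on my part.

```python
import urllib.error
import urllib.request


def can_fetch(ip, path="/health.txt", timeout=5.0):
    """Return True if the small test file can be retrieved over HTTP via `ip`."""
    url = f"http://{ip}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error status, and so on.
        return False


def choose_record_ip(primary_ip, backup_ip, current_ip, probe=can_fetch):
    """Pick the IP the DNS entry should point to after one round of checks."""
    if probe(primary_ip):
        return primary_ip  # primary works: use it (or switch back to it)
    if probe(backup_ip):
        return backup_ip   # primary is down but backup answers: fail over
    return current_ip      # assumption: if neither answers, change nothing


# Demo with a fake probe so the example runs without touching the network.
PRIMARY = "203.0.113.10"   # hypothetical primary-connection IP
BACKUP = "198.51.100.10"   # hypothetical backup-connection IP
reachable = {BACKUP}


def fake_probe(ip):
    return ip in reachable


print(choose_record_ip(PRIMARY, BACKUP, PRIMARY, probe=fake_probe))  # 198.51.100.10

reachable = {PRIMARY, BACKUP}
print(choose_record_ip(PRIMARY, BACKUP, BACKUP, probe=fake_probe))   # 203.0.113.10
```

With five IPs on each connection, the same function would simply be called once per primary/backup pair.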
One problem with having the DNS fail-over service check a web server is that if the web server goes down for an update, or for any other reason, the DNS service will treat that as a failure of the primary IP and try to switch over to the backup internet connection. That’s why we use a solid-state web server. The solid-state web server we use? It’s an old Hawking print server. Its web-based interface makes a perfect web site for the DNS fail-over service to check to see whether our primary connection is up. It also has no moving parts and doesn’t need to be taken offline for service packs or upgrades.
One detail to consider is that DNS entries are cached according to their time-to-live (TTL). For this fail-over to work in a timely fashion, we need the cached entries to expire fairly quickly. We use a low TTL, such as 300 seconds, for the domain records that need high availability.
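To see why the TTL matters, here is a rough back-of-the-envelope calculation in Python. The 120-second check interval is an assumed value, not a figure from our DNS provider, and the function name is made up. The estimate is a rough upper bound: the outage starts just after a successful check, and a resolver caches the old address just before the record changes.

```python
def worst_case_failover_seconds(ttl, check_interval, failed_checks_required=1):
    """Rough upper bound on how long clients may keep using the dead IP."""
    detection_time = check_interval * failed_checks_required
    return detection_time + ttl


# TTL of 300 seconds with an assumed 120-second check interval.
print(worst_case_failover_seconds(ttl=300, check_interval=120))    # 420

# The same outage with a one-day TTL (86400 seconds).
print(worst_case_failover_seconds(ttl=86400, check_interval=120))  # 86520
```

With a 300-second TTL the worst case is about seven minutes; with a one-day TTL, some customers could be sent to the dead IP for roughly a day.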
As with any fail-over, the DNS service continuously checks whether the primary IP has come back online and, if so, switches the DNS entry back.
The Bottom Line
The cost of business cable modem service, DSL, and metro ethernet service is very competitive, and the cost of a suitable firewall is usually under $500. If you host any web services in-house, you should also look into DNS fail-over. Again, it’s a very inexpensive service: we pay about $180 a year for external DNS with fail-over service.
If you don’t use redundant internet connections at your business, I urge you to look into it.
Awesome! Thanks for the explanation. I am trying to put this together now for our company, but I could not figure out how it was done as far as DNS records are concerned, other than manually changing each one of them to the secondary set of IPs during an outage.