Willy,
I find myself in the uncomfortable position of defending DNS-based GSLB. I share some of your concerns on the subject, and have reviewed the documents you linked in the past. That said, I do believe that it offers some benefits in some circumstances. Allow me to explain inline.
> Anyway, DNS is horribly bad for high availability. I recently read a very
> good article on the subject that comforted me in this feeling I've been
> having for a long time. Basically, the problem finds its roots in caches.
> DNS is only good to spread multiple IP addresses which are almost never
> updated. Use BGP to find where the IPs are located.
We don't expect this solution to address the issue of customers already connected to a dead site.
What we can do, however, is ensure that new users connect to a functioning site. Modern browsers, when communication is lost, will eventually (30 seconds?) connect to the other IPs found in the RR response. If the client restarts their browser, then with a reasonable DNS TTL they should also end up connecting to a functional site.
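To make the browser behaviour I am describing concrete, here is a rough sketch in Python of a client walking the list of addresses it got back from DNS. It is only an illustration of the idea, not how any particular browser does it: the function name and the `connect` parameter are my own invention, and the 30 second timeout is just the guess from above.

```python
import socket


def connect_first_reachable(addresses, port, timeout=30.0,
                            connect=socket.create_connection):
    """Try each address in order and return (address, connection) for the
    first one that accepts a connection.

    `connect` is injectable only so the logic can be exercised without a
    network; by default it is the standard library's create_connection.
    """
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as exc:
            # Dead or unreachable site: remember why and try the next IP.
            last_error = exc
    raise ConnectionError("no reachable address in the answer") from last_error
```

The point is simply that a dead site costs a new client one timeout, after which it lands on a live site, provided the answer contained more than one address.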
The primary goal isn't so much to move people around based on load as to cut out dead sites. A site will almost always be dead only if it drops its Internet connectivity. Backend failure will affect both sites equally, and front-end failure will be considered, for the sake of argument, as completely mitigated :)
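As a minimal sketch of what "cutting out dead sites" means on the DNS side, assuming some health check already tells us which sites are up: the answer contains only the IPs of live sites, with a short TTL. The function name, the 30 second TTL and the "answer with everything if all checks fail" fallback are my assumptions for illustration, not a description of any GSLB product.

```python
def build_answer(site_ips, site_is_up, ttl=30):
    """Return (ttl, ips) for a DNS answer that lists only live sites.

    site_ips   -- mapping of site name to its public IP
    site_is_up -- mapping of site name to the latest health check result
    """
    live = [ip for site, ip in site_ips.items() if site_is_up.get(site, False)]
    if not live:
        # Assumption: if every check fails, the checker is more likely
        # broken than all sites at once, so hand out every IP.
        live = list(site_ips.values())
    return ttl, live
```

For example, with two sites and one failed check, only the surviving site's IP is handed out until the check recovers.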
From the sound of things, for our secondary goal, DNS would be, in your opinion, a poor method of directing clients to the closest site.
>
> > > - Currently we aren't quite ready for it, but it would be very
> > > interesting to take BGP information, and use it to refer customers to
> > > the closest site, any ideas on this subject?
> >
> > Same. This would be the holy grail of scalability options for us.
>
> If you announce multiple IP addresses with your DNS, and if all those
> addresses are available on all sites, BGP will ensure that your customers
> will reach them on the closest site for them.
You seem to be suggesting using TCP with an anycast IP range available at multiple sites. There are articles on the subject [1] that suggest that this is relatively low risk (1 in 10000 requests?), but I always fear the consequences for a client who has a route flap.
As for our traffic, it consists of short HTTP sessions; however, customer connectivity, and diagnosis all the way to the customer's premises, are completely critical.
What's your position on TCP anycast? I fear it would lead to client-side issues that are difficult or impossible to diagnose.
> Now imagine that the site is replicated with a link between
> the routers :
>
> [clients] --- [ cisco router ] --- [ alteon ] --- [ haproxy ] --- [ servers ]
>                       |
>                       |
> [clients] --- [ cisco router ] --- [ alteon ] --- [ haproxy ] --- [ servers ]
What kind of link would there be between the two routers? Would it be a dedicated point-to-point link, or an encapsulated link of some type over the Internet link? In the latter case, you are in a bad place once you drop the upstream connectivity.
Of course, another approach would be a high-speed redundant backend link, independent of your Internet service. In such a case, you could use anycast, but have the IPs answered by HAProxy only at the local site. As long as you don't drop the backend link, the potential route flapping issues of anycast would be completely mitigated.
You could also use heartbeat to bring up the remote site's IPs in the event that you completely lose communication between the two sites.
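Roughly, the takeover decision I have in mind looks like the following. This is plain Python to show the logic only, not heartbeat configuration; the function name and the 10 second threshold are made up for the example.

```python
def ips_to_serve(local_ips, remote_ips, seconds_since_peer_heartbeat,
                 dead_after=10.0):
    """Decide which service IPs this site should answer for.

    Normally a site answers only for its own IPs. Once the peer has been
    silent for longer than `dead_after` seconds, it also brings up the
    peer's IPs.
    """
    if seconds_since_peer_heartbeat > dead_after:
        return list(local_ips) + list(remote_ips)
    return list(local_ips)
```

Note that if only the inter-site link dies while both sites stay on the Internet, both sites end up answering for all the IPs, which is the plain anycast case again.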
Another question: why do you have the Alteons in that diagram? What benefit do they bring to the equation?
> I don't know if my explanation was clear enough.
Quite clear, at least for those familiar with the domain :)
This is an extremely interesting discussion, I appreciate you taking the time to participate.
[1] http://readlist.com/lists/trapdoor.merit.edu/nanog/4/21854.html
-JohnF
Received on 2007/12/19 18:04