Re: make haproxy notice that backend server ip has changed

From: Piavlo <lolitushka#gmail.com>
Date: Sat, 06 Aug 2011 02:42:45 +0300


  Well, certainly AWS has its limitations, which force you to design a very different infrastructure than you would in a normal datacenter environment.
IMHO this is the great thing about those limitations: you are forced to start thinking differently, and you end up using a set of well-known and established tools to
overcome them. I'm talking mainly about monitoring/automation/deployment tools and centralized coordination services - so that you can automatically react to any change in the infrastructure.

With those tools you don't really care if some server IP changes - the IP only changes if you stop and then start an EC2 instance. If you reboot an EC2 instance the IP does not change. But normally you would not stop/start an instance - that really happens when something bad happens to the instance, so that you need to reboot it; and a reboot does not always work, since there might be a hardware problem on the server hosting this EC2 instance. So you need to stop it and then start it - when you start it again, it will come up on a different hardware server.

But you don't really need to do all this manually. If some EC2 instance is sick, this is detected and propagated through the centralized coordination service to the relevant parties. Then you can decide to start the service from the failed instance on another already-running EC2 instance, or start a new instance that configures itself and starts the service. The old failed instance can just be killed or suspended. (So VPC or a normal datacenter will not help here, since the service will be running on a different instance/server with a different IP. Yes, you could use a floating IP in a normal datacenter, but you would not want to do that for every backend, especially when backends are automatically added/removed; you would normally use a floating IP for the frontend.) When the service is active again on another/new instance, this is again propagated through the centralized coordination service. Then you automatically update the needed stuff on the relevant instances - like, in this specific case, update /etc/hosts and restart/reload haproxy. (All I wanted was to avoid the haproxy restart/reload - there is no technical problem at all with doing the restart.) And of course all this is done automatically, without human intervention.
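As a minimal sketch of that last step, here is what such an automation hook might run once the coordination service publishes the new backend IP. The hostname, file paths, and pid-file location are assumptions for illustration; only the `-sf` soft-reload flag is haproxy's own:

```shell
#!/bin/sh
# Hypothetical reaction hook sketching the flow described above.
update_backend() {
    host="$1"; ip="$2"
    hosts="${3:-/etc/hosts}"; pidfile="${4:-/var/run/haproxy.pid}"

    # 1. Drop the stale hosts entry, then append the new address.
    sed -i "/[[:space:]]$host\$/d" "$hosts"
    printf '%s %s\n' "$ip" "$host" >> "$hosts"

    # 2. Soft-reload haproxy: the new process re-resolves server names
    #    (now from the updated hosts file), then signals the old
    #    process to finish its connections and exit.
    haproxy -f /etc/haproxy/haproxy.cfg -p "$pidfile" \
            -sf "$(cat "$pidfile")"
}
```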

 From where I stand I see no particular reliability problem with AWS - a normal datacenter is just as unreliable for me as AWS is. I don't need the normal datacenter or the VPC. The use of those tools and the other AWS features makes AWS much more attractive and reliable than a normal datacenter.

The only really annoying thing about EC2 is that you can have only one IP per instance - this makes the HA stuff more difficult to implement, and you have to design it differently than in a normal datacenter. AFAIU the AWS VPC would not help there either, since VPC instances can still have only one IP, and/or you can't reassign it to another EC2 instance.

Alex

On 08/05/2011 11:53 PM, Hank A. Paulson wrote:
> I think the problem here is that the EC2 way of doing automatic server
> replacement is directly opposite to the normal and sane patterns of
> doing server changes in other environments. So someone who is only on
> EC2 thinks this is a process to hook into and use, and others, like
> Willy, are thinking "wtf, why would you do this?" - I don't think there
> will be much common ground to be found.
>
> Did someone already mention the idea of a soft restart after some
> external process notices a dns/ip mapping change? Does a soft restart
> (-sf) re-read the hosts file or redo server dns name lookups?
> Presumably, your instances should not restart so frequently that
> simple soft restarts would become a problem - afaik.
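To Hank's question: `-sf` does not make the old process re-read anything. It starts a brand-new haproxy process, and it is that new process which re-parses the config and redoes the libc name lookups (consulting /etc/hosts and resolv.conf, when it isn't chrooted away from them) before telling the old one to drain. A typical wrapper, with paths assumed for illustration:

```shell
# Hypothetical reload wrapper; paths are assumptions for illustration.
# -sf <pids>: old processes finish their connections, then exit.
# -st <pids>: old processes are killed immediately instead.
reload_haproxy() {
    pidfile="${1:-/var/run/haproxy.pid}"
    haproxy -f /etc/haproxy/haproxy.cfg -p "$pidfile" \
            -sf "$(cat "$pidfile")"
}
```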
>
> On 8/5/11 1:42 PM, Willy Tarreau wrote:
>> On Fri, Aug 05, 2011 at 11:11:50PM +0300, Piavlo wrote:
>>>> It's not a matter of a config option. You're supposed to run haproxy
>>>> inside a chroot. It will then not have access to the resolver.
>>> There are simple ways to make the resolver work inside chroot without
>>> making the chroot less secure.
>>
>> I don't know any such simple way. If you're in a chroot, you have no
>> FS access so you can't use resolv.conf, nsswitch.conf, nor even load
>> the dynamic libs that are needed for that. The only thing you can do
>> then is to implement your own resolver and maintain a second config
>> for this one. This is not what I call a simple way.
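To make this concrete, a typical hardened global section (a generic example, not anyone's actual config from this thread) locks the process into an empty directory after startup, which is exactly why the resolver files are unreachable:

```
global
    chroot /var/empty   # after startup the process's FS root is this
                        # empty dir: no /etc/hosts, no resolv.conf,
                        # no NSS shared libraries to load
    user  haproxy
    group haproxy
    daemon
```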
>>
>>>> I could ask the question the other direction : why try to resolve a
>>>> name to IP when a check fails, there is no reason why a server would
>>>> have its address changed without the admin being responsible for it.
>>> I don't agree that the admin is supposed to be responsible for it
>>> directly at all.
>>
>> So you're saying that you find it normal that a *server* changes its IP
>> address without the admin's consent ? I'm sorry but we'll never reach
>> an agreement there.
>>
>>> Say a backend server crashes/enters a bad state - this is detected
>>> and a new EC2 instance is automatically spawned and autoconfigured to
>>> replace the failed backend EC2 instance - which is optionally
>>> terminated.
>>> The /etc/hosts of all relevant EC2 instances is auto-updated (or DNS
>>> with a 60-second ttl is updated - by the way, the 60-second ttl works
>>> great within EC2). There is no admin person involved - all is done
>>> automatically.
>>
>> That's what I'm explaining from the beginning: this *process* is
>> totally broken and does not fit in any way with what I'd call common
>> practices:
>>
>> - a failed server is replaced with another server with a different IP
>>   address. It could very well have kept the same IP address. If servers
>>   in datacenters had their IP addresses randomly changed upon every
>>   reboot, it would require many more people to handle them.
>>
>> - you're not even shocked that something changes the /etc/hosts of all
>>   of your servers when any server crashes. That's something I would
>>   never accept either. Of course, the only reason for this stupidity
>>   is the point above.
>>
>> - on top of that, the DNS is updated every 60 seconds. That means that
>>   any process detecting the failure faster than the DNS updates will
>>   act based on the old IP address and possibly never refresh it. Once
>>   again, this is an ugly design imposed by the first point.
>>
>> I'm sorry Piavlo, but I can't accept such mechanisms. They are broken
>> from scratch, there is no other word. A server's admin should be the
>> only person who decides to change the server's address. Once you decide
>> to let a stupid process change everything underneath you, you can't
>> expect some software to guess things for you and to automagically
>> recover from the mess.
>>
>>>> Also, in your case it would not fix the issue: resolving when the
>>>> server goes down will bring you the old address, and only after the
>>>> caches expire would it bring the new one.
>>> If /etc/hosts is updated locally, there is no need to wait for cache
>>> expiration.
>>
>> 1) /etc/hosts is out of reach in a chroot
>> 2) it's out of the question to re-read /etc/hosts before every
>>    connection.
>> 3) if you don't recheck before every connection, you can connect to
>>    the wrong place due to the time it takes to propagate changes.
>>
>>> And if /etc/hosts is auto-updated by an appropriate tool, going one
>>> more step and restarting/reloading haproxy is not a problem at all -
>>> but this is what I want to avoid.
>>
>> If you want to avoid this mess, simply configure your servers not to
>> change address with the phases of the moon.
>>
>>> If instead I could, for example, send a command to the haproxy
>>> control socket to re-resolve all the names (or better, just a
>>> specific name) configured in haproxy - it would be much better,
>>> since /etc/hosts is already updated and it would resolve to the
>>> correct ip address.
>>
>> It could not, because /etc/hosts is not supposed to be present in the
>> empty chroot.
>>
>>> BTW afaiu adding/removing backends/frontends dynamically on the fly
>>> through some api / socket - is not something that is ever planned to be
>>> supported in haproxy?
>>
>> At the moment it's not planned, because it requires dynamically
>> changing limits that are set upon startup, such as the max memory and
>> max FD number.
>> Maybe in the future we'll be able to start with a configurable margin
>> to add some servers, but that's not planned right now. Changing a
>> server's address by hand might be much easier to implement though,
>> even though it will obviously break some protocols (eg: RDP). But it
>> could fit your use case.
>>
>> Regards,
>> Willy
>>
>>
>
Received on 2011/08/06 01:42

This archive was generated by hypermail 2.2.0 : 2011/08/06 01:45 CEST