Re: 1.3.17 in TCP mode sees dead servers (but they're not)

From: Nicolas MONNET <nicolas.monnet#ingenico.com>
Date: Thu, 07 May 2009 11:44:52 +0200


On Thu, 2009-05-07 at 06:26 +0200, Willy Tarreau wrote:

> generally this is caused by overloaded servers which can't manage to
> respond at all due to the amount of work they have in their backlog
> queue. Please add "maxconn 50" for instance on each "server" line to
> see if it changes anything. Also, what type of server are you using ?
> For instance, mongrel only accepts one request at a time and will not
> respond to any health-check while it's processing a long request, so
> with it you need "maxconn 1".
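For reference, the suggestion quoted above would look roughly like the fragment below. It is only a sketch: the proxy name, server names, addresses and ports are hypothetical placeholders, and only the "maxconn" parameter on each "server" line comes from the advice itself.

```
# Hypothetical TCP farm; names and addresses are placeholders.
listen app_farm 0.0.0.0:8080
    mode tcp
    balance roundrobin
    # Cap concurrent connections per server so that health checks
    # are not stuck behind a long backlog queue.
    server app1 192.0.2.11:8080 check maxconn 50
    server app2 192.0.2.12:8080 check maxconn 50
    # For a server that handles one request at a time (mongrel is the
    # example given above), the cap would be "maxconn 1" instead.
```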

It turns out the problem was that I had specified a source port along with the source address. It looks like that prevented haproxy from opening TCP connections. I was using the fixed port to distinguish tests from regular connections on the destination server. I'm going to add an address on the interface and use that as the source address instead.
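A minimal sketch of that change, assuming the "source" keyword is set at the proxy level. All names, addresses and the port number are made up for illustration, and the commented-out line is only a guess at the general shape of the old setting, not a copy of my real config.

```
# Hypothetical farm; names, addresses and ports are placeholders.
listen tls_farm 0.0.0.0:4433
    mode tcp
    balance source
    # Old form (assumed shape): fixed source address AND port.
    #   source 192.0.2.1:40000
    # New form: a dedicated extra address on the interface, with the
    # port left out so that the system picks a free one.
    source 192.0.2.100
    server s1 192.0.2.11:4433 check
    server s2 192.0.2.12:4433 check
```

The destination can then tell these connections apart by source address rather than by source port.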

To answer your question about the server type: the destination was either stunnel or an in-house app. If you need more info on where the problem was, let me know and I'll wade through the logs.

> > One question: couldn't it be possible to have redispatch work for TCP
> > connections?
> > connections?
>
> it does. However you have one particular config, you're using "balance
> source" with your TCP config. That means that when you redispatch the
> connection, you apply the LB algorithm again and you can only get back
> to the same server if it is still seen as up, because the size of the
> farm has not changed. There are two workarounds for this :
> - don't use "balance source" when not needed :-)
> - add enough retries to cover for the time to detect the server down,
> taking into account that each attempt waits at least 1 second.
>
> For the second solution, you can combine "inter" and "fastinter" to
> lower the failure detection time. For instance, "inter 5s fastinter
> 1s fall 2" will take 5 + 2*1 = 7s to see the server as down. So with
> at least 8 retries it should be OK. The redispatch will occur once
> the server has been taken out of the farm, so the source hash
> algorithm will bring you to another server.
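If I read the second workaround correctly, it would come out roughly as the fragment below. This is a sketch under assumptions: the proxy name, server names, addresses and ports are hypothetical, while the check timings and the retry count are the ones from the quoted example.

```
# Hypothetical TCP farm; names and addresses are placeholders.
listen tcp_farm 0.0.0.0:4000
    mode tcp
    balance source
    # Allow a retried connection to be sent to another server.
    option redispatch
    # Detection time from the quoted example: 5 + 2*1 = 7s.
    # Each attempt waits at least 1 second, so 8 retries cover it.
    retries 8
    server s1 192.0.2.11:4000 check inter 5s fastinter 1s fall 2
    server s2 192.0.2.12:4000 check inter 5s fastinter 1s fall 2
```

With "balance source" kept, the redispatch only lands on a different server once the failed one has been marked down, which is why the retries have to outlast the detection time.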

Thanks a lot for your answers. Much appreciated.  

Received on 2009/05/07 11:44
