Re: Truncated health check response from real servers

From: Willy Tarreau <w#1wt.eu>
Date: Wed, 10 Feb 2010 23:02:44 +0100


Hi Nick,

On Wed, Feb 10, 2010 at 04:10:46PM +0000, Nick Chalk wrote:
> Hello.
>
> I wonder if anyone can assist with this problem, reported by one of
> our customers.
>
> The load balancer is running HAProxy 1.4-rc1, with a modified version
> of the HTTP ECV patch applied. The customer is using ECV to check the
> status of a pair of IIS web servers:

Did you manage to fix the several remaining issues which could cause it to crash the process?

> listen web 10.3.4.150:80
>     mode tcp
>     option httpchk GET /gccheck.cfm HTTP/1.0
>     http-check expect rstring gCokay
>     balance source
>     server realserver1 10.4.1.6:80 weight 1 check inter 2000 rise 2 fall 3
>     server realserver2 10.4.1.16:80 weight 1 check inter 2000 rise 2 fall 3
>     server backup 127.0.0.1:9081 backup
>     option redispatch
>     option abortonclose
>     maxconn 40000
>     log global
>
> We are seeing both real servers repeatedly going on- and off-line with
> a period of tens of seconds. Packet tracing, stracing, and adding
> debug code to HAProxy itself has revealed that the real servers are
> always responding correctly, but HAProxy is sometimes receiving only
> part of the response.

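Indeed, the checks are rather simple right now: they parse the response
in one go, from whatever the first read returns. If the server's reply
arrives in several pieces, the check only sees the first piece, which
matches the partial responses you are observing. Krzysztof Oledzki
proposed a patch to make use of recv(MSG_PEEK) in order to leave
incomplete data in kernel buffers instead of consuming it. I don't think
it would have any side effects, so you may want to try it. It was posted
to the list about 3-4 weeks ago.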
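
To illustrate the recv(MSG_PEEK) idea, here is a minimal sketch. It is
not the patch from the list and not HAProxy code: the helper name
peek_for_marker(), its return convention, and the choice to treat "the
expected string is present" as "the response is complete" are all
assumptions made for the example. MSG_DONTWAIT is used so the call never
blocks, which assumes a system such as Linux that provides this flag.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (an illustration only, not HAProxy code).
 *
 * Peeks at the data pending on <fd> without consuming it. As long as
 * <marker> is not there, nothing is removed from the kernel buffer, so
 * the next call sees the same bytes plus whatever arrived in between.
 *
 * Returns  1 if <marker> was found (the peeked data is then consumed),
 *          0 if the response is still incomplete (nothing consumed),
 *         -1 on error, on close with no data, or if <buf> is full.
 *
 * Note: a peer that closes after sending a partial response still
 * yields 0 here, so the caller is assumed to enforce its own timeout.
 */
static int peek_for_marker(int fd, char *buf, size_t size, const char *marker)
{
    ssize_t n;

    assert(size > 1);

    n = recv(fd, buf, size - 1, MSG_PEEK | MSG_DONTWAIT);
    if (n < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
    if (n == 0)
        return -1;      /* connection closed, nothing pending */

    buf[n] = '\0';
    if (strstr(buf, marker) == NULL) {
        if ((size_t)n == size - 1)
            return -1;  /* buffer full, the marker can never fit */
        return 0;       /* incomplete: leave it all in the kernel */
    }

    /* Complete: now really consume the bytes we just looked at. */
    if (recv(fd, buf, (size_t)n, MSG_DONTWAIT) < 0)
        return -1;
    return 1;
}
```

A caller would invoke this each time the socket is reported readable and
only evaluate the check result once it returns non-zero. The cost of
this approach is that the same bytes are copied on each attempt, which
should be negligible for responses as small as a health check.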

Regards,
Willy

Received on 2010/02/10 23:02
