Re: Sending requests to servers that are marked as down?

From: Kai Krueger <>
Date: Sun, 30 Nov 2008 01:28:05 +0000

Hello Willy,

thanks for the detailed reply.

On 29/11/08 20:32, Willy Tarreau wrote:
> Hello Kai,
> On Fri, Nov 28, 2008 at 11:55:51AM +0000, Kai Krueger wrote:
>> Hello list,
>> we are trying to set up haproxy as a load balancer between several
>> webservers running long standing database queries (on the order of
>> minutes). Altogether it is working quite nicely, however there are two
>> somewhat related issues causing many requests to fail with a 503 error.
>> As the backend webservers have some load that is outside the control of
>> haproxy, during some periods of time, requests fail immediately with a
>> 503 over load error from the backends. In order to circumvent this, we
>> use the httpchk option to monitor the servers and mark them as down when
>> they return 503 errors. However, there seem to be cases where requests
>> get routed to backends despite them being correctly recognized as down
>> (according to haproxy stats page) causing these requests to fail too.
>> Worse still is that due to the use of least connection scheduling the
>> entire queue gets immediately drained to this server once this happens,
>> causing all of requests in the queue to fail with 503.
>> I haven't identified under what circumstances this exactly happens, as
>> most of the time it works correctly. One guess would be that the issue
>> may be something to do with that there are still several connections
>> open to the server when it gets marked down which happily continue to
>> run to completion.
> From your description, it looks like this is what is happening. However,
> this must be independent of the LB algo. What might be happening though,
> is that the last server to be identified as failing gets all the requests.

Is there a way to try and find out if this is the case? I think at those times, there were still other backends up.

Below I have a log which I think captures the effect. (The config is similar to the previous one, but some maxconn and check inter settings were tuned to make this easier to trigger, and it was run on a different machine.)

Nov 30 00:21:31 aiputerlx haproxy[10400]: Proxy ROMA started.
Nov 30 00:21:45 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:39.125] ROMA ROMA/mat 0/0/0/2339/6830 200 8909913 - - ---- 3/3/3/1/0 0/0 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:21:47 aiputerlx haproxy[10400]: Server ROMA/mat is DOWN. 2 active and 0 backup servers left. 2 sessions active, 0 requeued, 0 remaining in queue.
Nov 30 00:21:55 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:45.544] ROMA ROMA/mat 0/0/0/4754/9698 200 8909913 - - ---- 4/4/4/1/0 0/0 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:21:56 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:46.908] ROMA ROMA/mat 2/0/0/3830/8578 200 8909913 - - ---- 3/3/3/1/0 0/0 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:22:03 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:48.399] ROMA ROMA/mat 0/6843/0/3320/14661 200 8909913 - - ---- 2/2/2/0/0 0/1 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:22:05 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:44.025] ROMA ROMA/mat 0/0/13/10559/21837 200 8938528 - - ---- 1/1/1/0/0 0/0 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:22:18 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:42.519] ROMA ROMA/Quiky 0/0/130/3176/35580 200 8938503 - - ---- 0/0/0/0/0 0/0 "GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"
Nov 30 00:22:26 aiputerlx haproxy[10400]: Server ROMA/mat is UP. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.

According to the log, server ROMA/mat went DOWN at 00:21:47. However, the 4th request, which was sent at 00:21:48, was still queued to ROMA/mat, which was down at that time, as it did not come back up until 00:22:26. It might have received a valid health check reply by then and been in the "down, going up" state. Am I perhaps misreading the logs here?
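
To make the timing argument above easier to check, here is a minimal Python sketch that reads one HTTP log line and works out when the request was accepted and when it left the queue (accept time plus the Tq and Tw timers). The function names, the regular expression and the DOWN/UP timestamps typed in by hand are illustrative assumptions, not part of haproxy; the syslog prefix has only one-second resolution and no year, so the year is taken from the request lines.

```python
import re
from datetime import datetime, timedelta

# HTTP log part: [accept_date] frontend backend/server Tq/Tw/Tc/Tr/Tt ...
REQUEST_RE = re.compile(
    r"\[(?P<accept>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}\.\d{3})\] "
    r"\S+ (?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/\+?(?P<tt>-?\d+) "
)


def parse_request(line):
    """Return server name, accept time and the time the request left the queue."""
    m = REQUEST_RE.search(line)
    if m is None:
        return None  # not a request line, e.g. a "Server ... is DOWN" notice
    # %b expects English month names, which holds in the default C locale.
    accept = datetime.strptime(m.group("accept"), "%d/%b/%Y:%H:%M:%S.%f")
    # Tq: time to receive the request, Tw: time spent in queues (both in ms).
    # A timer of -1 means the phase never completed, so clamp it to zero.
    waited_ms = max(int(m.group("tq")), 0) + max(int(m.group("tw")), 0)
    return {
        "server": m.group("server"),
        "accept": accept,
        "left_queue": accept + timedelta(milliseconds=waited_ms),
    }


def outage_flags(request, down_at, up_at):
    """Tell whether the request was accepted / dequeued while the server was down."""
    return {
        "accepted_while_down": down_at <= request["accept"] < up_at,
        "dequeued_while_down": down_at <= request["left_queue"] < up_at,
    }


# DOWN and UP times copied by hand from the syslog prefixes (assumed year 2008).
DOWN_AT = datetime(2008, 11, 30, 0, 21, 47)
UP_AT = datetime(2008, 11, 30, 0, 22, 26)

FOURTH = (
    'Nov 30 00:22:03 aiputerlx haproxy[10400]: [30/Nov/2008:00:21:48.399] '
    'ROMA ROMA/mat 0/6843/0/3320/14661 200 8909913 - - ---- 2/2/2/0/0 0/1 '
    '"GET /api/0.5/map?bbox=8.20,49.07,8.50,49.30 HTTP/1.0"'
)

request = parse_request(FOURTH)
print(request["server"], request["accept"].time(), request["left_queue"].time())
print(outage_flags(request, DOWN_AT, UP_AT))
```

Under this reading, the request waited about 6.8 seconds in the queue (Tw = 6843) before being handed to mat, so both the accept time and the dequeue time fall between the DOWN and UP notices.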
>> Is there a way to prevent this from happening?
> Not really since the connections are already established, it's too late.
> You can shorten your srvtimeout though. It will abort the requests earlier.
> Yours are really huge (20mn for HTTP is really huge). But anyway, once
> connected to the server, the request is immediately sent, and even if you
> close, you only close one-way so that server will still process the request.

The backend servers have to query a database to generate the response and some of these replies can occasionally take up to 10 - 20 minutes to process. This is why we chose the very long request timeout so that even these can be handled. Th
>> The second question is regarding requeuing. As the load checks fluctuate
>> quite rapidly periodically querying the backends to see if they are
>> overloaded seems somewhat too slow, leaving a window open between when
>> the backends reject requests and until haproxy notices this and takes
>> down that server. More ideally would be for haproxy to recognize the
>> error and automatically requeue the request to a different backend.
> In fact, what I wanted to add in the past, was the ability to either
> modulate weights depending on the error ratios, or speed-up health
> checks when an error has been detected in a response, so that a failing
> server can be identified faster.

That would probably be useful to detect these problems and take the backend server offline faster. I had a similar idea and tried hacking the source code a bit. I ended up adding a call to set_server_down() in process_srv() of proto_http.c if the reply is a 500 error. It seems to work so far, but I haven't understood the code well enough to know whether this is even remotely valid or whether it has nasty race conditions or other problems. Is this a safe thing to do?
> Also, is it on purpose that your "inter"
> parameter is set to 10 seconds ? You need at least 20 seconds in your
> config to detect a faulty server. Depending on the load or your site,
> this might impact a huge number of requests. Isn't it possible to set
> shorter intervals ? I commonly use 1s, and sometimes even 100ms on some
> servers (with a higher fall parameter). Of course it depends on the work
> performed on the server for each health-check.

It could probably be reduced somewhat, although if there is a queue, even a short interval may see a lot of failed requests, as the failing server can drain away the entire queue in the meantime.
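
The arithmetic behind both points can be sketched roughly as follows. This is a simplified model, not haproxy's actual behaviour: it assumes health checks complete instantly, that a failing server answers every request with an error after a fixed `fail_latency_ms`, and that each of its `maxconn` slots is refilled from the queue immediately. All function and parameter names are made up for illustration.

```python
def detection_window_s(inter_s, fall):
    """Best and worst case delay (seconds) before a failing server is marked down.

    The server needs `fall` consecutive failed checks, one every `inter_s`.
    Best case it starts failing just before a check, worst case just after one.
    """
    if inter_s <= 0 or fall < 1:
        raise ValueError("inter_s must be positive and fall at least 1")
    return ((fall - 1) * inter_s, fall * inter_s)


def drained_upper_bound(window_ms, fail_latency_ms, maxconn, queue_len):
    """Upper bound on queued requests a fast-failing server consumes in the window."""
    if fail_latency_ms <= 0:
        raise ValueError("fail_latency_ms must be positive")
    per_slot = window_ms // fail_latency_ms  # errors returned by one connection slot
    return min(queue_len, per_slot * maxconn)


# "inter 10s" with an assumed fall of 3 gives a 20 to 30 second window.
print(detection_window_s(10, 3))
# Even a 1 second window lets a server that fails in 10 ms eat 100 queued requests.
print(drained_upper_bound(window_ms=1000, fail_latency_ms=10, maxconn=1, queue_len=500))
```

With `inter 10s` and a fall of 3 (an assumption, since the earlier config is not reproduced here) the lower bound is the 20 seconds Willy mentions. The second function shows why a shorter interval alone does not help much when errors come back within milliseconds.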
>> At
>> the moment haproxy seems to pass through all errors directly to the
>> client. Is there a way to configure haproxy to requeue on errors?
> clearly, no. It can as long as the connection has not been established
> to the server. Once established, the request begins to flow towards the
> server, so it's too late.
>> I
>> think I have read somewhere that haproxy doesn't requeue because it does
>> not know if it is safe, however these databases are completely read only
>> and thus we know that it is safe to requeue, as the requests have no
>> side effects.
> There are two problems to replay a request. The first one is that the
> request is not in the buffer anymore once the connection is established
> to the server. We could imagine mechanisms to block it under some
> circumstances. The second problem is that as you say, haproxy cannot
> know which requests are safe to replay. HTTP defines idempotent methods
> such as GET, PUT, DELETE, ... which are normally safe to replay.
> Obviously now GET is not safe anymore, and with cookies it's even
> more complicated because a backend server can identify a session
> for a request and make that session's state progress for each request.
> Also imagine what can happen if the user presses Stop, clicks a different
> link, and by this time, haproxy finally gives up on former request and
> replays it on another server. You can seriously affect the usability of
> a site or even its security (you don't want a login page to be replayed
> past the logout link for instance).
> So there are a lot of complex cases where replaying is very dangerous.

It is true that in many cases it is dangerous to requeue requests, but it would be nice if there were a configuration parameter with which one could tell haproxy that in this particular case one knows for sure that requeueing is safe.
> I'd be more tempted by adding an option to not return anything upon
> certain types of errors, so that the browser can decide itself whether
> to replay or not. Oh BTW you can already do that using "errorfile".
> Simply return an empty file for 502, 503 and 504, and your clients
> will decide whether they should replay or not when a server fails to
> respond.

Well, I had hoped to use haproxy to mask these errors and provide a higher availability and not shift the burden of retrying to the client.
>> P.S. In case it helps, I have attached the configuration we use with
>> haproxy (version, running on freeBSD)
> Thanks, that's a good reflex, people often forget that important part ;-)


Here are the settings that changed in the config that generated the log above. This time it was run on a Linux box:

server mat localhost:80 maxconn 6 weight 10 check inter 1s rise 30 fall 1
server Quiky maxconn 1 weight 10 check inter 10s rise 3 fall 1
server mat maxconn 1 weight 10 check inter 10s rise 3 fall 1
> Regards,
> Willy

Kai