Re: Sending requests to servers that are marked as down?

From: Kai Krueger <kakrueger#gmail.com>
Date: Wed, 03 Dec 2008 23:44:10 +0000


On 01/12/08 11:29, Willy Tarreau wrote:
> Hi Kai,
>
> On Mon, Dec 01, 2008 at 11:17:55AM +0000, Kai Krueger wrote:
>
>> I suspect another important reason why other people might not have
>> noticed this is that in our case, the backend servers still respond with
>> correct HTTP in a timely manner when marked down, so the other mechanisms
>> that retry and redispatch sessions on connection errors didn't catch it.
>>
>
> I think you're right, this makes the bug even harder to trigger.
>
>
>>> Could you please try this patch ? If it works, I'll merge it and release
>>> two more versions.
>>>
>>>
>> I haven't had the chance to test it properly yet so this is a
>> preliminary conclusion, but I do get the impression that unfortunately
>> it didn't seem to work. I need to try it in my test setup again though,
>> as there is too much other stuff going on on the production server. I
>> did notice though that in the code there is one place where
>> process_srv_queue() is called without the may_dequeue_tasks() check, and
>> that is in session_free(), which could potentially explain it?
>>
>
> Yes, it does. I remember why this one was not checked, it was because
> I did not want to leave unserved requests in the queue. But now that
> set_server_up() checks the queue, there should be no problem. Feel free
> to add a test above it.
>

OK, I can confirm that with your original patch plus the additional
check, things now work correctly and requests are no longer sent to
disabled servers.
Together with our patch to mark servers unavailable each time they
return a 503 to the client, things are finally running smoothly and
haproxy is doing a good job of balancing between the available
servers. :-)
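
For anyone following along, here is a rough sketch of the idea behind
the additional check. It is a simplified stand-alone model, not
haproxy's code: struct srv and the functions srv_may_dequeue(),
srv_drain_queue(), on_session_end() and on_server_up() are hypothetical
names made up for illustration. It only shows that guarding the dequeue
on session end keeps a down server from receiving queued requests,
while the "server up" path still picks up whatever was left waiting.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; not haproxy's actual structures. */
struct srv {
    bool up;       /* health state: false once the server is marked down */
    int  weight;   /* 0 means the server should take no new traffic */
    int  pending;  /* requests queued for this server */
    int  served;   /* requests dispatched to this server so far */
};

/* A server may take queued work only if it is up, has weight,
 * and actually has something waiting. */
static bool srv_may_dequeue(const struct srv *s)
{
    return s->up && s->weight > 0 && s->pending > 0;
}

/* Dispatch everything queued on the server; return how many moved. */
static int srv_drain_queue(struct srv *s)
{
    int n = s->pending;

    s->served += n;
    s->pending = 0;
    return n;
}

/* Session teardown: guarded, so a down server is handed nothing. */
static int on_session_end(struct srv *s)
{
    if (!srv_may_dequeue(s))
        return 0;
    return srv_drain_queue(s);
}

/* Server comes back up: serve what the guarded path left queued. */
static int on_server_up(struct srv *s)
{
    s->up = true;
    return srv_may_dequeue(s) ? srv_drain_queue(s) : 0;
}
```

With a server that is down and has three queued requests,
on_session_end() dispatches nothing, and a later on_server_up() drains
all three, so nothing is left to time out in the queue.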

> Well in fact I see a very minor case where we could have a problem, it is
> if we change a server's weight from 0 to non-zero while it still has
> pending connections and no active one. We will not consume the queue,
> which will eventually time out. But the problem is already present and will
> need to be addressed by checking the queue upon every weight change.
>
> Regards,
> Willy
>
>
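
The weight-change case described above could be handled roughly as
follows. Again this is only a sketch under assumptions: struct wsrv and
wsrv_set_weight() are hypothetical names, not haproxy's API, and the
model ignores connection limits. The point is just that the queue is
re-checked on every weight change, so a server going from weight 0 to
non-zero consumes its pending connections even when it has no active
one to trigger the dequeue.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; not haproxy's actual structures. */
struct wsrv {
    bool up;       /* server is considered healthy */
    int  weight;   /* 0 means no traffic should be sent */
    int  active;   /* connections currently being served */
    int  pending;  /* connections waiting in the server's queue */
};

/* Apply a new weight, then re-check the queue.
 * Returns the number of pending connections that were activated. */
static int wsrv_set_weight(struct wsrv *s, int new_weight)
{
    int moved = 0;

    s->weight = new_weight;

    /* Check the queue on every weight change, not only on session end. */
    if (s->up && s->weight > 0 && s->pending > 0) {
        moved = s->pending;
        s->active += moved;
        s->pending = 0;
    }
    return moved;
}
```

Starting from weight 0 with two pending connections and none active,
setting the weight to a non-zero value moves both out of the queue
instead of leaving them there until they time out.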
Received on 2008/12/04 00:44
