[PATCH]: Spread checks

From: Krzysztof Oledzki <ole#ans.pl>
Date: Sun, 23 Sep 2007 22:44:15 +0200 (CEST)

On Tue, 18 Sep 2007, Willy Tarreau wrote:

> On Tue, Sep 18, 2007 at 11:35:43AM +0200, Krzysztof Oledzki wrote:
>> I noticed that each server receives all checks in a very short (<1ms) time
>> (attached checklog2 file). I think that having 10s for 48 (16*3) tests it
>> is possible to reduce both servers' and loadbalancer stress a little, by
>> running each test every 10s/48 = ~ 0.2s.
>
> Yes, and this was even worse in the past because all servers for a same
> group were checked at the same instant. There were people doing load-balancing
> on same machines with multiple ports who got regular load peaks during the
> checks. So I have spread them apart within one backend.
>
> However, the problem still remains if you share the same server between
> many instances. I'm not sure how I could improve this. Maybe add a per-backend
> start delay for the checks, which would be equal to min_inter/#backends. As an
> alternative right now, you can rotate your servers within different backends.
>
> I think I could also add a global "spread-check" parameter allowing us to add
> some random time between all checks in order to spread them apart. It would
> take a percentage parameter adding or removing that many percent to the interval
> after each check.

The attached patch implements a per-server start delay in a different way. Checks are now spread globally, not just within one backend. It also starts them sooner: IMHO there is no need to add 'server->inter' when calculating the first execution time. The calculations were moved from cfgparse.c to checks.c. There is a new function, start_checks(), which is not called when haproxy is started in MODE_CHECK.
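
As a rough illustration of the global spreading (this is a sketch, not the code from the patch), the first check of each server can be placed at an even fraction of one interval, counted across all checked servers rather than per backend. The function name first_check_delay_ms() and its parameters are hypothetical:

```c
#include <assert.h>

/* Hypothetical helper, not taken from the patch.
 * Returns the delay in ms before the first check of the server at
 * position `pos` (0-based) among `total` checked servers, so that all
 * first checks are spread evenly over one interval of `inter_ms`.
 * No extra `inter_ms` is added, so the server at position 0 is
 * checked immediately. */
static unsigned int first_check_delay_ms(unsigned int inter_ms,
                                         unsigned int pos,
                                         unsigned int total)
{
    assert(total > 0 && pos < total);

    /* Widen before multiplying to avoid overflow on large intervals. */
    return (unsigned int)((unsigned long long)inter_ms * pos / total);
}
```

With inter_ms = 10000 and total = 48, as in the quoted report, consecutive servers start about 208 ms apart, which matches the 10s/48 = ~0.2s estimate.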

With this patch it is also possible to set a global 'spread-check' parameter. It takes a percentage value (1..50; something near 5..10 is probably a good idea), and haproxy adds or removes up to that many percent to the original interval after each check. My test shows that with 18 backends, 54 servers in total and 10000ms/5%, it takes about 45 minutes to mix them completely.
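
The idea can be sketched roughly as follows, using rand() from the C library. This is only an illustration under my own assumptions, not the code from the patch: the name jittered_interval_ms() is hypothetical, a value of 0 is treated as "no spreading", and the caller is assumed to have seeded the generator once with srand().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the patch.
 * Returns `inter_ms` moved by a pseudo-random amount of at most
 * `spread_pct` percent in either direction.
 * Assumes srand() was called once beforehand by the caller. */
static int jittered_interval_ms(int inter_ms, int spread_pct)
{
    assert(inter_ms > 0);
    assert(spread_pct >= 0 && spread_pct <= 50);

    /* Maximum deviation from the nominal interval, in ms. */
    int range = (int)((long long)inter_ms * spread_pct / 100);
    if (range == 0)
        return inter_ms;

    /* rand() / (RAND_MAX + 1.0) lies in [0, 1),
     * so offset lies in [0, 2 * range). */
    int offset = (int)(2.0 * range * (rand() / (RAND_MAX + 1.0)));

    /* Result lies in [inter_ms - range, inter_ms + range). */
    return inter_ms - range + offset;
}
```

With inter_ms = 10000 and spread_pct = 5, every returned interval falls between 9500 and 10499 ms, so checks that start aligned drift apart a little on each round.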

I decided to use the rand/srand pseudo-random number generator. I am aware that it is not recommended when good randomness is required, but a) we do not need a good random generator here and b) it is probably the most portable one.

Best regards,

                                 Krzysztof Olędzki

Received on 2007/09/23 22:44
