From: Willy Tarreau <w#1wt.eu>
Date: Fri, 10 Jul 2009 07:00:22 +0200

Hi Craig,

On Thu, Jul 09, 2009 at 11:14:56PM +0200, Craig wrote:
> Hi Willy, hi list,
>
> I've thought about haproxy checks a bit lately.
> Here is my approach: don't do these checks in C.
>
> It's
> a) not fun to code string/checks for every protocol in C
> b) too time-inefficient
> c) not flexible enough
>
> Why re-invent the wheel?!
>
> Let's abuse nagios plugins, iptables and do some bash scripting.
>
> We'll use nagios-plugins to perform content checks on our services and
> if they are unavailable, we'll just firewall them; haproxy will be
> configured to do frequent tcp-checks only (100ms?!).
>
> Nagios plugins are standardized
> (http://nagiosplug.sourceforge.net/developer-guidelines.html)
> and offer a wide variety of functions.
(...)

> You would have to specify how frequently a service is checked and when
> it is considered up again; but IMHO that would be rather easy to add.
>
> I haven't tested the code, so please see it as an example; we'd read the
> configuration from a file and would iterate through a list of
> hosts/services and not just do a single check like in the example above.
> It's only meant to show you what I mean.
> I just didn't want to spend hours of coding to see a design flaw in it
> later.
>
> Any ideas, opinions on this?

For a long time we've been discussing how to implement *external* checkers. I must say that I had not thought about Nagios plugins, but this should be one idea among others.

However, these checks must not be called from haproxy; instead, external testers must inform haproxy about the result of each test. There are several reasons for this. One is simply that haproxy runs chrooted without any privileges, so it would not even be able to execute such scripts. Another is that not everyone wants to use nagios plugins: everyone wants their own method, with the "expect" language being one of the most popular.

In principle, having some external tool inform haproxy about a server's status (up/down) is not extremely difficult. The complexity comes from the interaction between haproxy and the tester. For instance, we've talked for a long time about a mechanism to speed up or slow down tests when a server returns a number of errors. So the tester must be able to get that information from haproxy dynamically.
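
To make the speed-up idea concrete, here is a minimal sketch of how a tester might pick its next check interval from an error count it obtained from haproxy. Everything in it is an assumption for illustration: the function name, the parameters, and the halving rule are hypothetical, and how the tester would actually fetch the error count is exactly the part that remains undefined.

```python
def next_check_interval(recent_errors, base=2.0, fastest=0.1):
    """Return the delay in seconds before the next check.

    Hypothetical rule: halve the interval for each recent error reported
    for the server, but never go below `fastest`.
    """
    interval = base / (2 ** recent_errors)
    return max(interval, fastest)


# A healthy server is checked at the base rate; a server that has
# recently returned errors is checked more often.
print(next_check_interval(0))
print(next_check_interval(2))
```

With no errors the tester stays at the base interval, and the lower bound keeps a badly failing server from being flooded with checks.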

If you look at the keepalived daemon, there are a lot of interesting tests in it too. We have already talked with Alex about a way to externalize them so that the tester could inform various daemons such as keepalived and haproxy, and also send notifications (which is also a problem in haproxy). The tester should also get its server list from the daemons themselves. We did not have enough time to push the ideas very far, but I think that a protocol must first be defined; then it would make sense to implement a tester using that protocol (which could very well use your nagios plugins, for instance).

Also there are other things to consider. Some testers will only be able to return OK/FAILED for each test, and won't be able to implement counters, retries, ... So the interface must take that into account so that haproxy can take care of managing the up/down counters itself if the tester is not able to do so.
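
As a rough sketch of that last point, the code below shows how the receiving side could turn a stream of plain OK/FAILED results into an up/down state by counting consecutive results itself. The class and the `rise`/`fall` thresholds are assumptions made for this example, not a description of any existing interface between haproxy and a tester.

```python
from dataclasses import dataclass


@dataclass
class ServerHealth:
    """Tracks up/down state from bare OK/FAILED test results.

    Assumed thresholds: `rise` consecutive successes bring a down server
    back up, `fall` consecutive failures take an up server down.
    """
    rise: int = 2
    fall: int = 3
    up: bool = True
    streak: int = 0  # consecutive results contradicting the current state

    def report(self, ok: bool) -> bool:
        """Feed one test result and return the resulting up/down state."""
        if ok == self.up:
            # The result agrees with the current state: reset the counter.
            self.streak = 0
            return self.up
        self.streak += 1
        threshold = self.fall if self.up else self.rise
        if self.streak >= threshold:
            # Enough contradicting results in a row: flip the state.
            self.up = not self.up
            self.streak = 0
        return self.up


health = ServerHealth(rise=2, fall=3)
for result in (False, False, False, True, True):
    print(result, "->", "UP" if health.report(result) else "DOWN")
```

A tester that only knows OK/FAILED can then stay trivial, while a smarter tester that does its own counting could report the final state directly.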

You're welcome to take part in the discussion on the subject if you're interested, as I think that none of us has all the keys to define something which would fit everyone's needs.

Regards,
Willy

Received on 2009/07/10 07:00

Re: Haproxy scripted checks