Re: Maintenance mode

From: Rene Plattner <rene.plattner#uibk.ac.at>
Date: Thu, 11 Sep 2008 08:04:15 +0200


Willy Tarreau wrote:
> Hi Alexander,
>
> On Wed, Sep 10, 2008 at 11:03:55PM +0200, Alexander Staubo wrote:
>> Guys, I would like to bring this subject up again. I have not been
>> able to work out a satisfactory solution to the problem.
>>
>> In a nutshell, when we -- my company -- perform a site update, we want
>> to display a static web page showing the current maintenance state. A
>> site update usually involves taking down all Rails processes, checking
>> out new code, and bringing Rails up again. So while this is going on,
>> HAProxy will attempt, and fail, to connect to its backend servers.
>>
>> There are a few possible solutions, all of them unsatisfactory:
>>
>> * Using "errorloc" or "errorfile" to show a static page on 503 errors.
>> That is what we are using in our current setup. This is unsatisfactory
>> because the 503 is an error that occurs in non-maintenance situations;
>> telling the user that the site is under maintenance when there's an
>> actual error condition just confuses everyone (on one particularly bad
>> day, people thought we were updating the site all the time and told us
>> to please stop doing it).
>>
>> * Another suggestion has been to use backup servers for this purpose.
>> This is unsatisfactory for the same reason that "error*" is.
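For reference, the "errorfile" approach described above amounts to something like this (backend name and file path are hypothetical):

```
# current setup sketch: serve a static local page for any 503,
# whether it is planned maintenance or a real outage
backend rails
    errorfile 503 /etc/haproxy/errors/maintenance.http
```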
>
> In fact, what would be possible right now would be to start your backup
> server only when you are putting your servers in maintenance mode. That
> way, it will return the maintenance page only if you're working on them.
> But I agree that it's not very satisfactory either.
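Willy's suggestion could be sketched like this (server names and addresses are hypothetical): the maintenance server carries the "backup" keyword, so it only receives traffic once all regular servers are down, and you start it just before taking the Rails processes down:

```
backend rails
    server app1 10.0.0.1:3000 check
    server app2 10.0.0.2:3000 check
    # started only during planned maintenance; receives traffic
    # only when app1 and app2 are both down
    server maint 10.0.0.9:8080 check backup
```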
>
>> * An iptables-based solution, suggested earlier, is too roundabout and
>> non-intuitive.
>
> I don't much like that one either for several reasons :
> - requires scripts because rules are hard to write and error-prone
> - detecting a problem in existing rules is hard, and there is a
> risk of leaving erroneous rules.
> - performance limitations
>
>> * Loading a separate configuration file is not appropriate because a
>> box may be running multiple sites, but we want to be able to put a
>> single site in maintenance mode without disturbing others.
>
> ... and it would require permanently keeping multiple config files
> up to date.
>
>> * Changing the current config and sending a signal to reload HAProxy
>> is too intrusive and inconvenient. It can be done programmatically,
>> but involves maintaining some sort of master configuration file that
>> you filter through a preprocessor into the real config file. It's
>> icky.
>
> agreed.
>
>> Now, I have a suggestion for a proper solution, and if Willy likes it
>> I will try my hand at coughing up a patch. The idea is to support
>> user-defined variables that are settable at runtime. In the
>> configuration, these variables would be usable as ACLs:
>
> 100% agreed on internal variables. In another post, I even talked about making
> environment variables accessible.
>
>>   frontend http
>>     ...
>>     acl variable maintenance_mode true
>>     use_backend maintenance if maintenance_mode
>>
>> To control a variable you would invoke the haproxy binary:
>>
>>   $ haproxy -S maintenance_mode=true
>>
>> or
>>
>>   $ haproxy -S maintenance_mode=false
>>
>> Using shared memory for these variables is probably the easiest,
>> fastest and most secure option. It would be mapped into HAProxy's local address
>> space, so a lookup is essentially just a local memory read, cheap
>> enough to check on every request. Similarly, read and write access to
>> the variables could then be limited to the HAProxy user, if I remember
>> my POSIX shared memory semantics correctly.
>
> Using SHM would indeed be possible. There is just one thing I don't like
> with IPCs in general, it is that they are not cleaned up when a process
> dies. It's very common to find lots of remaining IPCs on a system where
> apps use them. So we have to find a way to ensure we either :
> - always clean them up
> - always use the same key (eg: put it in the conf)
>
>> Having such variables at hand would also let you do other tricks not
>> specifically related to maintenance. For example, you can have
>> external monitoring scripts that modify the behaviour of HAProxy based
>> on some sort of load parameter.
>
> There is still something to manage. If we use SHMs, we need to lock access
> to variables. Semaphores are out of the question, as they're extremely expensive
> and dangerous. Spinlocks are possible and ideal but require that we link with
> libpthread, which is a new added dependency. We could also just count on atomic
> writes on integers on most architectures, and just read the variable twice to
> ensure that it is stable. This would be the cheapest in fact.
>
>> Thoughts?
>
> In the meantime, I can propose an alternative : make use of environment
> variables in the configuration, and just reload your config.
>
> Your config would look like this :
>
> acl maintenance_mode env_int(HAPROXY_MAINT) gt 0
> use_backend maintenance if maintenance_mode
>
> When you want to put it in maintenance mode, simply restart it that way :
> $ HAPROXY_MAINT=1 haproxy -f /etc/haproxy.cfg -sf $(pidof haproxy)
>
> then perform your changes and when finished :
> $ HAPROXY_MAINT=0 haproxy -f /etc/haproxy.cfg -sf $(pidof haproxy)
>
> We could even add the ability to set internal variables on the command line :
> $ haproxy -f /etc/haproxy.cfg -s HAPROXY_MAINT=1 -sf $(pidof haproxy)
>
> The advantage is that we will need it one day or another, and it is very
> easy to implement (the acl function will be getenv() followed by atol()).
> Later, a CLI will make it possible to set/unset variables.
>
> What do you think about this ?
>
> Regards,
> Willy
>
>

Take this:

client --> stunnel, haproxy --> webmail, maintenance-site

frontend http_frontend_webmail
    bind 127.0.0.1:1080
    bind host:80
    mode http
    log global
    option httplog
    option logasap
    option dontlognull
    option forwardfor except 127.0.0.1
    option httpclose
    use_backend http_backend_webmail if TRUE
    use_backend http_backend_maint if TRUE
    monitor-net nmgmt.uibk.ac.at
    option clitcpka
    maxconn 512
    timeout client 120000

backend http_backend_webmail
    mode http
    balance roundrobin
    cookie SERVERID indirect
    option httpchk GET /haproxy.check
    rspirep ^(Location:\ )(http:[/]*)(.*) \1https://\3
    server webmail1 address:80 cookie wm01 check inter 10000 maxconn 256
    server webmail2 address:80 cookie wm02 check inter 10000 maxconn 256
    retries 3
    option srvtcpka
    timeout server 120000

backend http_backend_maint
    mode http
    balance roundrobin
    reqirep ^(GET\ /)(\ .*) \1/webmail.html\2
    server msite address:80 check inter 10000 maxconn 512 backup
    retries 3
    option srvtcpka
    timeout server 120000

If webmail is up, all traffic goes to webmail; otherwise it goes to the msite backup server. We also rewrite the HTTP request (reqirep) so that the backup server serves webmail.html.

Kind regards,

-- 
Dipl.-Ing. Rene' Plattner
Zentraler Informatikdienst (ZID)
Universität Innsbruck, Österreich
Technikerstr. 13
A-6020 Innsbruck
Tel: ++43512/507-2360
Fax: ++43512/507-2944
Received on 2008/09/11 08:04

This archive was generated by hypermail 2.2.0 : 2008/09/11 08:15 CEST